21. [X] The following fees are submitted:
BASIC NATIONAL FEE (37 CFR 1.492(a)(1)-(5)):                          CALCULATIONS (PTO USE ONLY)

[ ] Neither international preliminary examination fee (37 CFR 1.482)
    nor international search fee (37 CFR 1.445(a)(2)) paid to USPTO
    and International Search Report not prepared by the EPO or JPO     $1040.00

[X] International preliminary examination fee (37 CFR 1.482) not paid to
    USPTO but International Search Report prepared by the EPO or JPO    $890.00

[ ] International preliminary examination fee (37 CFR 1.482) not paid to USPTO
    but international search fee (37 CFR 1.445(a)(2)) paid to USPTO     $740.00

[ ] International preliminary examination fee (37 CFR 1.482) paid to USPTO
    but all claims did not satisfy provisions of PCT Article 33(1)-(4)  $710.00

[ ] International preliminary examination fee (37 CFR 1.482) paid to USPTO
    and all claims satisfied provisions of PCT Article 33(1)-(4)        $100.00

ENTER APPROPRIATE BASIC FEE AMOUNT =                                    890.00

Surcharge of $130.00 for furnishing the oath or declaration later than
[ ] 20 [X] 30 months from the earliest claimed priority date (37 CFR 1.492(e)).    130.00
CLAIMS                        NUMBER FILED    NUMBER EXTRA    RATE

Total claims                  113 - 20        93              18.00        1,674.00

Independent claims            15 - 3          12              84.00        1,008.00

MULTIPLE DEPENDENT CLAIM(S) (if applicable)                 + 280.00         280.00

TOTAL OF ABOVE CALCULATIONS =                                              3,982.00
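
The total entered on the fee sheet can be checked arithmetically. The following is a minimal sketch, not part of the form: it uses only the amounts entered on the sheet (basic fee 890.00, surcharge 130.00, 113 total claims and 15 independent claims against allowances of 20 and 3, rates of 18.00 and 84.00, and a multiple dependent claim fee of 280.00). The function and parameter names are hypothetical.

```python
from decimal import Decimal


def national_fee_total(basic_fee, surcharge, total_claims, independent_claims,
                       multiple_dependent_fee,
                       total_rate=Decimal("18.00"),
                       independent_rate=Decimal("84.00"),
                       total_allowance=20, independent_allowance=3):
    """Sum the fee sheet entries (hypothetical helper, for checking only)."""
    # Only claims beyond the allowance are charged at the per-claim rate.
    extra_total = max(total_claims - total_allowance, 0)
    extra_independent = max(independent_claims - independent_allowance, 0)
    return (basic_fee + surcharge
            + extra_total * total_rate
            + extra_independent * independent_rate
            + multiple_dependent_fee)


total = national_fee_total(Decimal("890.00"), Decimal("130.00"),
                           113, 15, Decimal("280.00"))
print(total)  # 3982.00
```

The result agrees with the 3,982.00 entered as the total of the above calculations.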



[ ] Applicant claims small entity status. See 37 CFR 1.27. The fees indicated above
    are reduced by

SUBTOTAL =

Processing fee of $ for furnishing the English translation later than
[ ] 20 [ ] 30 months from the earliest claimed priority date (37 CFR 1.492(f)). +

TOTAL NATIONAL FEE =                                                       3,982.00

Fee for recording the enclosed assignment (37 CFR 1.21(h)). Assignment
must be accompanied by appropriate cover sheet (37 CFR 3.28, 3.31)
( per property).

TOTAL FEES ENCLOSED =                                                      3,982.00



Amount to be Refunded:
Charged:

[ ] A check in the amount of $ 3,982.00 to cover the above fees is enclosed.

in the amount of $
to cover the above fees. A duplicate copy of this sheet is enclosed.

c. [X] The Commissioner is hereby authorized to charge any additional fees which may be required or credit
any overpayment to my Deposit Account No. 06-2375. A duplicate copy of this sheet is enclosed.

NOTE: Where an appropriate time limit under 37 CFR 1.494 or 1.495 has not been met, a petition to revive
(37 CFR 1.137(a) or (b)) must be filed and granted to restore the application to pending status.
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE



In re Patent Application of: 
Edward Burton, et al. 



U.S. Patent and Trademark Office 
Commissioner for Patents 
Washington, D.C. 20231 

Dear Sir: 

Submitted herewith for filing in connection with the above-referenced patent 
application is a labeled, computer readable copy of the Sequence Listing included in the 
application. 

I hereby state that I have reviewed the paper copy of the Sequence Listing contained, 
as required by 37 CFR 1.821(e), and the computer readable form of the Sequence Listing, as 
required by 37 CFR 1.821(c), and that the content of the paper and computer readable copies 
are the same. 

Early favorable consideration of the patent application is respectfully solicited.

Dated: April 4, 2002 Respectfully submitted,
Docket No.: HO-P02428US0
(PATENT) 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

In re Patent Application of: 
Edward Burton, et al. 

Application No. : UNKNOWN Group Art Unit: N/A 

Filed: April 4, 2002 Examiner: Not Yet Assigned 

For: UTROPHIN GENE PROMOTER 

FIRST PRELIMINARY AMENDMENT 

Box Non-Fee Amendment 

Commissioner for Patents 
Washington, DC 20231 
Dear Sir: 

Prior to examination on the merits, please amend the above-identified U.S. patent 
application as follows: 

In the Claims 

Please add the following new claims 53-111. 

53. An isolated nucleic acid comprising a promoter which comprises a sequence
of nucleotides selected from (i) the human promoter sequence shown in Figure 1 and (ii) the
mouse promoter sequence shown in Figure 2, free or substantially free of utrophin coding
sequence.

54. An isolated nucleic acid consisting essentially of a promoter which comprises
the sequence of nucleotides shown 5' to position 1440 in Figure 1.

55. An isolated nucleic acid consisting essentially of a promoter which comprises
the sequence of nucleotides shown 5' to position 1183 of the mouse sequence shown in
Figure 2.



I hereby certify that this correspondence is being deposited with the U.S.
Postal Service as Express Mail, Airbill No. EU098493497US, in an
envelope addressed to: Box Non-Fee Amendment, Commissioner for
Patents, Washington, DC 20231, on the date shown below.

Dated: April 4, 2002

Signature:

(Melissa W. Acosta)
56. An isolated nucleic acid consisting essentially of a promoter which comprises
the nucleotides numbered 1199-1440 in the sequence shown in Figure 1.

57. An isolated nucleic acid consisting essentially of a promoter which comprises
the nucleotides numbered 959-1183 in the sequence shown in Figure 2.

58. An isolated nucleic acid consisting essentially of a promoter which comprises 
the nucleotide sequence ACAGGACATCCCAGTGTGCAGTTCG. 

59. An isolated nucleic acid consisting essentially of a promoter which comprises
a sequence of nucleotides that is an allele, mutant or derivative, by way of addition, insertion,
deletion or substitution of one or more nucleotides, of the promoter sequence shown in Figure
1, which sequence has at least 60% homology with the promoter sequence shown in figure 1
and which promoter, when operably linked to a sequence of nucleotides, has the ability to
initiate transcription of that sequence, said transcription being muscle-specific.

60. An isolated nucleic acid consisting essentially of a promoter which comprises
a sequence of nucleotides that is an allele, mutant or derivative, by way of addition, insertion,
deletion or substitution of one or more nucleotides, of the promoter sequence shown in Figure
2, which sequence has at least 60% homology with the promoter sequence shown in figure 2
and which promoter, when operably linked to a sequence of nucleotides, has the ability to
initiate transcription of that sequence, said transcription being muscle-specific.

61. An isolated nucleic acid consisting essentially of a promoter which comprises
a sequence of nucleotides that is an allele, mutant or derivative, by way of addition, insertion,
deletion or substitution of one or more nucleotides, of the promoter sequence shown in Figure
2, which hybridises to the promoter sequence shown in figure 2 under stringent hybridisation
conditions and which promoter, when operably linked to a sequence of nucleotides, has the
ability to initiate transcription of that sequence, said transcription being muscle-specific.

62. A nucleic acid construct comprising an isolated nucleic acid according to any 
of the preceding claims operably linked to a heterologous sequence. 

63. A nucleic acid construct comprising an isolated nucleic acid according to any 
one of claims 53 to 61 operably-linked to a coding sequence. 
64. A nucleic acid construct according to claim 63 wherein said coding sequence
encodes a reporter molecule. 

65. An in vitro host cell comprising a nucleic acid construct according to claim 63. 

66. An in vitro host cell comprising a nucleic acid construct according to claim 64. 

67. A method comprising culturing a host cell according to claim 65 under 
conditions for expression of the peptide or polypeptide encoded by said coding sequence. 

68. A method as claimed in claim 67 wherein said coding sequence encodes a 
reporter molecule. 

69. A method according to claim 67 comprising detection of transcription of said 
coding sequence. 

70. A method according to claim 67 comprising detection of expression of the 
peptide or polypeptide encoded by said coding sequence. 

71. A method of screening for a substance able to modulate utrophin promoter 
activity, the method comprising contacting an expression system containing a nucleic acid 
construct according to claim 63 with a test or candidate substance and determining 
transcription of said coding sequence or expression of the peptide or polypeptide encoded by 
said coding sequence. 

72. A method as claimed in claim 63 wherein said coding sequence encodes a 
reporter molecule and said reporter molecule is detected. 

73. A method according to claim 71 wherein the expression system comprises a 
host cell containing said nucleic acid construct. 

74. A method which comprises, following identification of a substance able to 
modulate utrophin promoter activity in accordance with a method according to claim 71, 
manufacture of the substance and/or use of the substance in manufacture or formulation of a
composition. 
75. The use of an isolated nucleic acid according to any of claims 53 to 58 for
promoting transcription of an operably linked sequence of nucleotides. 

76. The use of claim 75 wherein the transcription is tissue-specific, with the 
tissue-specificity being muscle-specific. 

77. An isolated nucleic acid molecule comprising a nucleotide sequence encoding 
a polypeptide including the amino acid sequence shown in Figure 1 or Figure 2. 

78. An isolated nucleic acid molecule comprising a nucleotide sequence encoding 
a polypeptide that is an allele, mutant or derivative of a polypeptide including the amino acid 
sequence shown in Figure 1, which amino acid sequence has at least 60% homology with the 
polypeptide sequence in Figure 1 or Figure 2. 

79. An isolated nucleic acid molecule comprising a nucleotide sequence encoding 
a polypeptide that is an allele, mutant or derivative of a polypeptide shown in Figure 1 or
Figure 2, which nucleotide sequence hybridises with the nucleotide sequence encoding the
polypeptide in Figure 1 or Figure 2 under stringent hybridisation conditions.

80. An isolated nucleic acid molecule comprising a nucleotide sequence encoding 
a polypeptide having the amino acid sequence shown in Figure 9. 

81. An isolated nucleic acid molecule comprising the nucleotide sequence shown
in Figure 9.

82. A nucleic acid of any one of claims 77 to 81 comprised in a vector.

83. A nucleic acid according to any one of claims 77 to 81 comprised in an 
expression vector. 

84. An in vitro host cell containing an expression vector according to claim 83. 

85. A method including introduction of nucleic acid according to any of claims 77 
to 81 into a cell. 



86. 

vector. 
87. A method according to claim 85 wherein said introduction takes place in vitro.

88. A method as claimed in claim 85 which includes causing or allowing 
expression of said polypeptide encoding nucleotide sequence in a cell. 

89. A method according to claim 88 wherein the cell is part of a mammal.

90. A method according to claim 88 wherein the expression product is purified 
and/or isolated following expression. 

91. A method according to claim 90 wherein the expression product is formulated
into a composition which includes at least one additional component, following purification 
and/or isolation of the expression product. 

92. An isolated polypeptide as encoded by nucleic acid according to any of claims 
77 to 81. 

93. An isolated utrophin exon 1B polypeptide selected from:

(i) human utrophin exon 1B polypeptide of which the amino acid sequence is shown in
Figure 1; and

(ii) mouse utrophin exon 1B of which the amino acid sequence is shown in Figure 1.

94. An isolated polypeptide including the human polypeptide according to claim
93.

95. An isolated polypeptide including the mouse polypeptide according to claim
93.

96. An isolated polypeptide which has 60 % homology with the polypeptide 
according to claim 94 or 95. 

97. An isolated fragment of a polypeptide according to claim 93, which fragment 
is 5 to 25 amino acids in length.

98. An isolated fragment of a polypeptide according to claim 93, which fragment 
is 10 to 20 amino acids in length. 




99. An antibody specific for a polypeptide according to any one of claims 92 to
96.

100. A composition including a polypeptide according to claim 92 and a 
pharmaceutically acceptable excipient. 

101. A composition including a polypeptide according to any one of claims 92 to
98 and a pharmaceutically acceptable excipient.

102. A composition including a polypeptide according to claim 94 and a 
pharmaceutically acceptable excipient. 

103. A composition including a fragment according to claim 97 or claim 98 and a 
pharmaceutically acceptable excipient.

104. A composition including an antibody according to claim 99 and a 
pharmaceutically acceptable excipient.

105. A method for treating a dystrophin phenotype in a mammal, which comprises 
administering a nucleic acid according to any one of claims 77 to 81 in a therapeutically 
effective amount. 

106. A method as claimed in claim 105 wherein said nucleic acid is an expression
vector.

107. A method for treating a dystrophin phenotype in a mammal, which comprises 
administering a polypeptide according to claim 92 in a therapeutically effective amount. 

108. A method for treating a dystrophin phenotype in a mammal, which comprises 
administering a polypeptide according to any one of claims 93 to 95 in a therapeutically 
effective amount. 

109. A method for treating a dystrophin phenotype in a mammal, which comprises
administering a polypeptide according to claim 96 in a therapeutically effective amount. 
110. A method for treating a dystrophin phenotype in a mammal, which comprises
administering a fragment according to claim 97 or claim 98 in a therapeutically effective 
amount. 

111. A method for treating a dystrophin phenotype in a mammal, which comprises 
administering an antibody according to claim 99 in a therapeutically effective amount. 
REMARKS/ARGUMENTS



Claims 1-52 were in the original PCT application as filed. Applicants have canceled
claims 53-111, without prejudice or acquiescence and have added claims 53-111. Claims
53-111 delete the multiple dependency and clarify the claims without prejudice or acquiescence.
Applicants assert that no new matter has been added.

Applicants have added claims 53-111 to delete the multiple dependency and to clarify
the claims without prejudice or acquiescence. Claims 53-111 have been canceled without
prejudice or acquiescence. Therefore, these amendments do not narrow the scope of the
claims within the meaning of Festo Corp. v. Shoketsu Kinzoku Kogyo Kabushiki Co., Ltd., 
234 F.3d 558, 586, 56 USPQ2d 1865, 1886 (Fed. Cir. 2000). 

In view of the above, each of the presently pending claims in this application is 
believed to be in immediate condition for allowance. Accordingly, the Examiner is 
respectfully requested to pass this application to issue. 



CONCLUSION 



Dated: April 4, 2002 
Houston, Texas 77010-3095
(713) 651-5151
(713) 651-5246 (Fax)
UTROPHIN GENE PROMOTER

The present invention is based on cloning of a genomic
promoter region of the human utrophin gene and of the mouse
utrophin gene.

The severe muscle wasting disorders Duchenne muscular 
dystrophy (DMD) and the less debilitating Becker muscular 
dystrophy (BMD) are due to mutations in the dystrophin gene 
resulting in a lack of dystrophin or abnormal expression of 
truncated forms of dystrophin, respectively. Dystrophin is a 
large cytoskeletal protein (427kDa with a length of 125nm) 
which in muscle is located at the cytoplasmic surface of the
sarcolemma, the neuromuscular junction (NMJ) and myotendinous
junction (MTJ). It binds to a complex of proteins and
glycoproteins spanning the sarcolemma called the dystrophin
associated glycoprotein complex (DGC). The breakdown of the
integrity of this complex due to loss of, or impairment of
dystrophin function, leads to muscle degeneration and the DMD
phenotype.

The dystrophin gene is the largest gene so far identified in
man, covering over 2.7 megabases and containing 79 exons. The
corresponding 14kb dystrophin mRNA is expressed predominantly
in skeletal, cardiac and smooth muscle with lower levels in
brain. Transcription of dystrophin in different tissues is
regulated from either the brain promoter (predominantly active
in neuronal cells) or muscle promoter (differentiated myogenic
cells, and primary glial cells) giving rise to differing first
exons. A third promoter between the muscle promoter and the
second exon of dystrophin regulates expression in cerebellar
Purkinje neurons. Recently reviewed in (Tinsley, et al. (1994)
Proc Natl Acad Sci U S A 91: 8307-13, Blake, et al. (1994)
Trends in Cell Biol. 4: 19-23, Tinsley, et al. (1993) Curr Opin
Genet Dev. 3: 484-90).
There are various approaches which have been adopted for the
gene therapy of DMD, using the mdx mouse as a model system.
However, there are considerable problems related to the number
of muscle cells that can be made dystrophin positive, the
levels of expression of the gene and the duration of
expression (Partridge, et al. (1995) British Medical Bulletin
51: 123-137). It has also become apparent that simply
re-introducing genes expressing the dystrophin carboxy-terminus
has no effect on the dystrophic phenotype although the DGC
appears to be re-established at the sarcolemma (Cox, et al.
(1994) Nature Genet 8: 333-339, Greenberg, et al. (1994) Nature
Genet 8: 340-344).

In order to circumvent some of these problems, possibilities
of compensating for dystrophin loss using a related protein,
utrophin, are being explored as an alternative route to
dystrophin gene therapy. A similar strategy is currently
being evaluated in clinical trials to up-regulate foetal
haemoglobin to compensate for the affected adult-globin chains
in patients with sickle cell anaemia (Rodgers, et al. (1993) N
Engl J Med. 328: 73-80, Perrine, et al. (1993) N Engl J Med.
328: 81-86).

Utrophin is a 395kDa protein encoded by a multiexonic 1Mb UTRN
gene located on chromosome 6q24 (Pearce, et al. (1993) Hum Mol
Genet. 2: 1765-1772). At present the tissue regulation of
utrophin is not fully understood. In the dystrophin deficient
mdx mouse, utrophin levels in muscle remain elevated soon
after birth compared with normal mice; once the utrophin
levels have decreased to the adult levels (about 1 week after
birth), the first signs of muscle fibre necrosis are detected.
However there is evidence to suggest that in the small calibre
muscles, continual increased levels of utrophin can interact
with the DGC complex (or an antigenically related complex) at
the sarcolemma thus preventing loss of the complex with the
result that these muscles appear normal. There is also a
substantial body of evidence demonstrating that utrophin is
capable of localising to the sarcolemma in normal muscle.
During fetal muscle development there is increased utrophin
expression, localised to the sarcolemma, up until 18 weeks in
the human and 20 days gestation in the mouse. After this time
the utrophin sarcolemmal staining steadily decreases to the
significantly lower adult levels shortly before birth where
utrophin is localised almost exclusively to the NMJ. The
decrease in utrophin expression coincides with increased
expression of dystrophin. See reviews (Ibraghimov-Beskrovnaya,
et al. (1992) Nature 355, 696-702, Blake, et al.
(1994) Trends in Cell Biol. 4: 19-23, Tinsley, et al. (1993)
Curr Opin Genet Dev. 3: 484-90).

Thus, in certain circumstances utrophin can localise to the
sarcolemma probably at the same binding sites as dystrophin,
through interactions with actin and the DGC. Accordingly, if
expression of utrophin is sufficiently elevated, it may
maintain the DGC and thus alleviate muscle degeneration in
DMD/BMD patients (Tinsley, et al. (1993) Neuromuscul Disord 3,
537-9).

However, manipulation of utrophin expression and screening for
molecules able to upregulate expression is hampered by the
limited understanding of utrophin expression regulation and
its promoters. We have previously isolated a promoter element
lying within the CpG island at the 5' end of the utrophin
locus that is active in a broad range of cell types and
tissues, and shown it to be synaptically regulated in vivo
(Dennis, et al. (1996) Nucleic Acids Res 24, 1646-52 and WO
96/34101). The sequence contains a consensus N-box, a 6bp
motif important in the regulation of other genes expressed at
the NMJ (Koike, et al. (1995) Proc Natl Acad Sci USA 92,
10624-10628). Localisation of utrophin at the NMJ in mature
muscle is partially attributable to enhanced transcription of
utrophin at sub-junctional myonuclei, with consequent synaptic
accumulation of mRNA (Gramolini, et al. (1997) J Biol Chem
272, 8117-20, Vater, et al. (1998) Molecular and Cellular
Neuroscience 10, 229-242). The utrophin promoter drives
synaptic transcription of a reporter gene in vivo; this
expression pattern is abolished by point mutations within the
N-box (Gramolini, et al. (1998) J Biol Chem 273, 736-43).

The present inventors hypothesised that utrophin might be
transcribed from more than one promoter, an important
consideration for the following reasons: First, it may be
undesirable to interfere with the mechanisms underlying
synaptic regulation of genes, as this might affect expression
of other post-synaptic components and impair the structure and
function of the NMJ; a promoter without synaptic regulatory
elements might be a more suitable target for pharmacological
manipulation. Second, cardiac dysfunction is a common feature
of the dystrophinopathies (Hoogerwaard, et al. (1997) J Neurol
244, 657-63, Sasaki, et al. (1998) Am Heart J 135, 937-44); if
the cardiac utrophin message was transcribed from a different
promoter, then it might prove necessary to up-regulate this.
Finally, inclusion of additional regulatory sequences might
increase the yield of a screening program to identify small
molecules capable of transcriptional activation of utrophin.

We have now identified an alternative promoter lying within
the large second intron of the utrophin gene, 50kb 3' to exon
2. The promoter is highly regulated, expressed in a wide range
of tissues and has little similarity to the synaptically
expressed promoter. This promoter drives transcription of a
widely expressed unique first exon that splices into a common
full-length mRNA at exon 3. This unique exon (called exon 1B)
encodes a novel 31 amino acid N-terminus for the utrophin
protein which may be involved in binding to the muscle
membrane. The sequences of the two utrophin promoters are
dissimilar, and we predict that they respond to discrete sets
of cellular signals.
Exon 1B is primarily considered herein to encode the indicated
31 amino acids. However, the splice occurs within a codon for
aspartate. This aspartate residue is common to both isoforms
of utrophin. In embodiments of the invention an aspartate
residue may be included C-terminal to the 31 amino acids to
provide a 32 amino acid peptide, which may be joined to
additional amino acids, for instance additional utrophin
sequence as discussed. See, for instance, Figure 8 for one
embodiment.

These findings significantly contribute to the understanding
of the molecular physiology of utrophin expression and are
important because the promoter reported here provides an
alternative target for transcriptional activation of utrophin
in DMD muscle. This promoter does not contain synaptic
regulatory elements and might, therefore, be a more suitable
target for pharmacological manipulation than the previously
described promoter.

We have now cloned this alternative utrophin promoter and
exon, and the present invention in various aspects and
embodiments is based on the sequence information obtained and
provided herein.

One major use of the promoter is in screening for substances
able to modulate its activity. It is well known that
pharmaceutical research leading to the identification of a new
drug generally involves the screening of very large numbers of
candidate substances, both before and even after a lead
compound has been found. This is one factor which makes
pharmaceutical research very expensive and time-consuming. A
method or means assisting in the screening process will have
considerable commercial importance and utility. Substances
identified as upregulators of the utrophin promoter represent
an advance in the fight against muscular dystrophy since they
provide basis for design and investigation of therapeutics for
in vivo use.

In one aspect, the present invention provides an isolated
nucleic acid comprising a promoter, the promoter comprising a
sequence of nucleotides shown in Figure 1 or Figure 2. The
promoter may comprise one or more fragments of the sequence
shown in Figure 1 or Figure 2 sufficient to promote gene
expression. The promoter may comprise or consist essentially
of a sequence of nucleotides 5' to position 1440 in Figure 1
(human) or position 1183 in Figure 2 (mouse). Preferably the
promoter comprises or consists essentially of nucleotides 1199
to 1440 of the human sequence shown in Figure 1, or the
equivalent sequence in mouse, e.g. nucleotides 959 to 1183 of
Figure 2.

An even smaller portion of this part of the sequences shown in
Figure 1 or Figure 2 may be used as long as promoter activity
is retained. Restriction enzymes or nucleases may be used to
digest the nucleic acid, followed by an appropriate assay (for
example as illustrated herein using luciferase constructs) to
determine the minimal sequence required. A preferred
embodiment of the present invention provides a nucleic acid
isolate with the minimal nucleotide sequence shown in Figure 1
or Figure 2 required for promoter activity. The minimal
promoter element is situated between the PvuII restriction
site at position 1199 in the human sequence and the
transcription start site at 1440 bp in the human sequence and
between nucleotides 959 to 1183 in the mouse sequence (see
Figure 2).

In one embodiment a promoter according to the present
invention comprises or consists of sequence that is shown in
Figure 3 to be conserved between the human and mouse
sequences, e.g. the 25 nucleotide sequence:

ACAGGACATCCCAGTGTGCAGTTCG spanning the transcriptional start
site.

WO 01/25461 PCT/GB00/03800

The promoter may comprise one or more sequence motifs or
elements conferring developmental and/or tissue-specific
regulatory control of expression. For instance, the promoter
may comprise a sequence for muscle-specific expression, e.g.
an E-box element/myoD binding site, such as CANNTG, preferably
CAGGTG.
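
As an illustration only, the E-box consensus CANNTG (where N is any nucleotide) can be located in a candidate sequence with a simple pattern search. The sketch below is not a method described in this specification; the function name and the short test sequence are hypothetical, and only the consensus CANNTG and the preferred form CAGGTG are taken from the text.

```python
import re

# CANNTG: C, A, any two nucleotides, T, G. The lookahead allows
# overlapping occurrences to be reported.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_e_boxes(sequence):
    """Return (0-based position, motif) for each CANNTG site in sequence."""
    sequence = sequence.upper()
    return [(m.start(), m.group(1)) for m in E_BOX.finditer(sequence)]


# Hypothetical 10-nucleotide test sequence containing the preferred CAGGTG.
hits = find_e_boxes("TTCAGGTGAA")
print(hits)
```

Because the complement of CANNTG read in the opposite direction is again CANNTG, scanning one strand is enough to find sites on either strand.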

Other regulatory sequences may be included, for instance as
identified by mutation or digest assay in an appropriate
expression system or by sequence comparison with available
information, e.g. using a computer to search on-line
databases.

By "promoter" is meant a sequence of nucleotides from which
transcription may be initiated of DNA operably linked
downstream (i.e. in the 3' direction on the sense strand of
double-stranded DNA).

"Operably linked" means joined as part of the same nucleic 
acid molecule, suitably positioned and oriented for 
transcription to be initiated from the promoter. DNA operably 
linked to a promoter is "under transcriptional initiation 
regulation" of the promoter. 

The present invention extends to a promoter which has a
nucleotide sequence which is an allele, mutant, variant or
derivative, by way of nucleotide addition, insertion,
substitution or deletion of a promoter sequence as provided
herein. Systematic or random mutagenesis of nucleic acid to
make an alteration to the nucleotide sequence may be performed
using any technique known to those skilled in the art. One or
more alterations to a promoter sequence according to the
present invention may increase or decrease promoter activity,
or increase or decrease the magnitude of the effect of a
substance able to modulate the promoter activity.




"Promoter activity" is used to refer to ability to initiate 
transcription. The level of promoter activity is quantifiable 
for instance by assessment of the amount of mRNA produced by 
transcription from the promoter or by assessment of the amount 
of protein product produced by translation of mRNA produced by 
transcription from the promoter. The amount of a specific 
mRNA present in an expression system may be determined for 
example using specific oligonucleotides which are able to 
hybridise with the mRNA and which are labelled or may be used 
in a specific amplification reaction such as the polymerase 
chain reaction. Use of a reporter gene as discussed further 
below facilitates determination of promoter activity by 
reference to protein production. 

In various embodiments of the present invention a promoter
which has a sequence that is a fragment, mutant, allele,
derivative or variant, by way of addition, insertion, deletion
or substitution of one or more nucleotides, of the sequence of
either the human or the mouse promoters shown in Figures 1 and
2, respectively, has at least about 60% homology with one or
both of the shown sequences, preferably at least about 70%
homology, more preferably at least about 80% homology, more
preferably at least about 90% homology, more preferably at
least about 95% homology. The sequence in accordance with an
embodiment of the invention may hybridise with one or both of
the shown sequences, or the complementary sequences (since DNA
is generally double-stranded).

Similarity or homology (the terms are used interchangeably) or 
identity is preferably determined using GAP, from version 2 0 
of GCG. This uses the algorithm of Needleman and Wunsch to 
align sequences inserting gaps as appropriate to improve the 
agreement between the two sequences. Parameters employed are
the default ones: for nucleotide sequences - Gap Weight 50,
Length Weight 3, Average Match 10.000, Average Mismatch 0.000;
for peptide sequences - Gap Weight 8, Length Weight 2, Average
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Match 2.912, Average Mismatch -2.003. Peptide similarity 
scores are taken from the BLOSUM62 matrix. Also useful is the 
TBLASTN program, of Altschul et al . (1990) J. Mol , Biol. 215: 
403-10, or BestFit, which is part of the Wisconsin Package, 
Version 8, September 1994, (Genetics Computer Group, 575 
Science Drive, Madison, Wisconsin, USA, Wisconsin 53711) . 
Sequence comparisons may be made using FASTA and FASTP (see 
Pearson & Lipman, 1988. Methods in Enzyraology 183: 63-98). 
Parameters are preferably set, using the default matrix, as 
follows: Gapopen (penalty for the first residue in a gap) : - 
12 for proteins / -16 for DNA; Gapext (penalty for additional 
residues in a gap) : -2 for proteins / -4 for DNA; KTUP word 
length: 2 for proteins / 6 for DNA. 
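
By way of illustration only, the global alignment principle of Needleman and Wunsch referred to above may be sketched in Python as follows. This is not the GCG GAP program: the sketch assumes a simple linear gap penalty and arbitrary match, mismatch and gap scores chosen for brevity, and the function name global_identity is hypothetical.

```python
def global_identity(a, b, match=1, mismatch=-1, gap=-2):
    """Percent identity of a Needleman-Wunsch global alignment.

    The scoring values are illustrative assumptions, not the GAP defaults.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back, counting identical columns and total alignment columns
    i, j, identical, columns = n, m, 0, 0
    while i > 0 or j > 0:
        s = None
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
        if s is not None and score[i][j] == score[i - 1][j - 1] + s:
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
        columns += 1
    return 100.0 * identical / columns if columns else 100.0
```

With these assumed scores, global_identity("ACGT", "AGT") aligns the shorter sequence with a single gap and reports three identical positions out of four columns. Percentages obtained with GAP, BestFit or FASTA under their own default parameters may differ.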

Nucleic acid sequence homology may be determined by means of 
selective hybridisation between molecules under stringent 
conditions.

Preliminary experiments may be performed by hybridising under 
low stringency conditions. For probing, preferred conditions 
are those which are stringent enough for there to be a simple 
pattern with a small number of hybridisations identified as 
positive which can be investigated further. 

For example, hybridizations may be performed, according to the
method of Sambrook et al. (below) using a hybridization
solution comprising: 5X SSC (wherein 1X SSC = 0.15 M sodium
chloride; 0.15 M sodium citrate; pH 7), 5X Denhardt's reagent,
0.5-1.0% SDS, 100 µg/ml denatured, fragmented salmon sperm
DNA, 0.05% sodium pyrophosphate and up to 50% formamide.
Hybridization is carried out at 37-42°C for at least six
hours. Following hybridization, filters are washed as
follows: (1) 5 minutes at room temperature in 2X SSC and 1%
SDS; (2) 15 minutes at room temperature in 2X SSC and 0.1%
SDS; (3) 30 minutes - 1 hour at 37°C in 1X SSC and 1% SDS; (4)
2 hours at 42-65°C in 1X SSC and 1% SDS, changing the solution

One common formula for calculating the stringency conditions
required to achieve hybridization between nucleic acid
molecules of a specified sequence homology is (Sambrook et
al., 1989): Tm = 81.5°C + 16.6 Log [Na+] + 0.41 (% G+C) - 0.63
(% formamide) - 600/#bp in duplex.

As an illustration of the above formula, using [Na+] = [0.368]
and 50% formamide, with GC content of 42% and an average
probe size of 200 bases, the Tm is 57°C. The Tm of a DNA
duplex decreases by 1 - 1.5°C with every 1% decrease in
homology. Thus, targets with greater than about 75% sequence
identity would be observed using a hybridization temperature
of 42°C. Such a sequence would be considered substantially
homologous to the nucleic acid sequence of the present
invention.
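
The worked illustration above can be checked numerically. The following Python sketch evaluates the melting temperature formula of Sambrook et al. (1989) for the stated conditions; the function and parameter names are hypothetical and are introduced only for this example.

```python
import math

def melting_temperature(na_molar, gc_percent, formamide_percent, duplex_bp):
    """Tm in degrees C: 81.5 + 16.6 log10[Na+] + 0.41 (%G+C)
    - 0.63 (% formamide) - 600 / (bp in duplex)."""
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / duplex_bp)

# Conditions from the illustration: 0.368 M Na+, 42% GC,
# 50% formamide and a 200 base probe
tm = melting_temperature(0.368, 42, 50, 200)
print(round(tm))  # 57
```

The individual terms are 81.5 - 7.2 + 17.2 - 31.5 - 3.0, which gives approximately 57°C.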

It is well known in the art to increase stringency of
hybridisation gradually until only a few positive clones
remain. Other suitable conditions include, e.g. for detection
of sequences that are about 80-90% identical, hybridization
overnight at 42°C in 0.25M Na2HPO4, pH 7.2, 6.5% SDS, 10%
dextran sulfate and a final wash at 55°C in 0.1X SSC, 0.1%
SDS. For detection of sequences that are greater than about
90% identical, suitable conditions include hybridization
overnight at 65°C in 0.25M Na2HPO4, pH 7.2, 6.5% SDS, 10%
dextran sulfate and a final wash at 60°C in 0.1X SSC, 0.1%
SDS.

In a further embodiment, hybridisation of a nucleic acid
molecule to an allele or variant may be determined or
identified indirectly, e.g. using a nucleic acid amplification
reaction, particularly the polymerase chain reaction (PCR).
PCR requires the use of two primers to specifically amplify
target nucleic acid, so preferably two nucleic acid molecules
with sequences characteristic of the utrophin promoter are
employed. Using RACE PCR, only one such primer may be needed
(see "PCR Protocols: A Guide to Methods and Applications",
Eds. Innis et al., Academic Press, New York, (1990)).

Thus a method involving use of PCR in obtaining nucleic acid
according to the present invention may include:

(a) providing a preparation of nucleic acid, e.g. from a
muscle cell;

(b) providing a pair of nucleic acid molecule primers
useful in (i.e. suitable for) PCR, at least one of said
primers being a primer specific for nucleic acid according to
the present invention;

(c) contacting nucleic acid in said preparation with said
primers under conditions for performance of PCR;

(d) performing PCR and determining the presence or
absence of an amplified PCR product.
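
Steps (b) to (d) can be mimicked computationally to predict whether a given primer pair would be expected to yield a product from a known template. The Python sketch below is a simplification under stated assumptions: it requires exact primer matches, ignores annealing temperature and mismatch tolerance, and all function names and sequences are made up for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))

def predicted_product(template, forward_primer, reverse_primer):
    """Return the predicted amplicon, or None if the two primers do not
    both match the template exactly in a productive orientation."""
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start < 0:
        return None
    # The reverse primer anneals to the opposite strand, so its reverse
    # complement appears on the given strand downstream of the forward primer.
    site = reverse_complement(reverse_primer)
    end = template.find(site, start + len(forward_primer))
    if end < 0:
        return None
    return template[start:end + len(site)]
```

A return value of None corresponds to the absence of an amplified product in step (d); an actual reaction may of course behave differently from this exact-match model.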

The presence of an amplified PCR product may indicate
identification of an allele or other variant. The sequence
may have the ability to promote transcription (i.e. have
"promoter activity") in muscle cells, e.g. human muscle cells,
or muscle-specific transcription.

Further provided by the present invention is a nucleic acid
construct comprising a utrophin promoter region or a fragment,
mutant, allele, derivative or variant thereof able to promote
transcription, operably linked to a heterologous gene, e.g. a
coding sequence. By "heterologous" is meant a gene other than
utrophin. Modified forms of utrophin are generally excluded.
Generally, the gene may be transcribed into mRNA which may be
translated into a peptide or polypeptide product which may be
detected and preferably quantitated following expression. A
gene whose encoded product may be assayed following expression
is termed a "reporter gene", i.e. a gene which "reports" on
promoter activity.

The reporter gene preferably encodes an enzyme which catalyses
a reaction which produces a detectable signal, preferably a
visually detectable signal, such as a coloured product. Many
examples are known, including β-galactosidase and luciferase.
β-galactosidase activity may be assayed by production of blue
colour on substrate, the assay being by eye or by use of a
spectrophotometer to measure absorbance. Fluorescence, for
example that produced as a result of luciferase activity, may
be quantitated using a spectrophotometer. Radioactive assays
may be used, for instance using chloramphenicol
acetyltransferase, which may also be used in non-radioactive
assays. The presence and/or amount of gene product resulting
from expression from the reporter gene may be determined using
a molecule able to bind the product, such as an antibody or
fragment thereof. The binding molecule may be labelled
directly or indirectly using any standard technique.

Those skilled in the art are well aware of a multitude of 
possible reporter genes and assay techniques which may be used 
to determine gene activity. Any suitable reporter/assay may 
be used and it should be appreciated that no particular choice 
is essential to or a limitation of the present invention. 

Expression of a reporter gene from the promoter may be in an
in vitro expression system or may be intracellular (in vivo).
Expression generally requires the presence, in addition to the
promoter which initiates transcription, of a translational
initiation region and transcriptional and translational
termination regions. One or more introns may be present in
the gene, along with mRNA processing signals (e.g. splice
sites).

Systems for cloning and expression of a polypeptide are 
discussed further below. 



The present invention also provides a nucleic acid vector 
comprising a promoter as disclosed herein. Such a vector may
comprise a suitably positioned restriction site or other means 
for insertion into the vector of a sequence heterologous to 
the promoter to be operably linked thereto. 

Suitable vectors can be chosen or constructed, containing
appropriate regulatory sequences, including promoter
sequences, terminator fragments, polyadenylation sequences,
enhancer sequences, marker genes and other sequences as
appropriate. For further details see, for example, Molecular
Cloning: a Laboratory Manual: 2nd edition, Sambrook et al,
1989, Cold Spring Harbor Laboratory Press. Procedures for
introducing DNA into cells depend on the host used, but are
well known.

Thus, a further aspect of the present invention provides a
host cell containing a nucleic acid construct comprising a
promoter element, as disclosed herein, operably linked to a
heterologous gene. A still further aspect provides a method
comprising introducing such a construct into a host cell. The
introduction may employ any available technique, including,
for eukaryotic cells, calcium phosphate transfection,
DEAE-Dextran transfection, electroporation, liposome-mediated
transfection and transduction using retrovirus.

The introduction may be followed by causing or allowing 
expression of the heterologous gene under the control of the 
promoter, e.g. by culturing host cells under conditions for
expression of the gene. 

In one embodiment, the construct comprising promoter and gene 
is integrated into the genome (e.g. chromosome) of the host 
cell. Integration may be promoted by inclusion in the 
construct of sequences which promote recombination with the
genome, in accordance with standard techniques. 

Many known techniques and protocols for manipulation of
nucleic acid, for example in preparation of nucleic acid
constructs, mutagenesis, sequencing, introduction of DNA into
cells and gene expression, and analysis of proteins, are
described in detail in Current Protocols in Molecular Biology,
Second Edition, Ausubel et al. eds., John Wiley & Sons, 1994,
the disclosure of which is incorporated herein by reference.

Nucleic acid molecules, constructs and vectors according to
the present invention may be provided isolated and/or purified
(i.e. from their natural environment), in substantially pure
or homogeneous form, free or substantially free of a utrophin
coding sequence, or free or substantially free of nucleic acid
or genes of the species of interest or origin other than the
promoter sequence. Nucleic acid according to the present
invention may be wholly or partially synthetic. The term
"isolate" encompasses all these possibilities.

Nucleic acid constructs comprising a promoter (as disclosed
herein) and a heterologous gene (reporter) may be employed in
screening for a substance able to modulate utrophin promoter
activity. For therapeutic purposes, e.g. for treatment of
muscular dystrophy, a substance able to up-regulate expression
of the promoter may be sought. A method of screening for
ability of a substance to modulate activity of a utrophin
promoter may comprise contacting an expression system, such as
a host cell, containing a nucleic acid construct as herein
disclosed with a test or candidate substance and determining
expression of the heterologous gene. The level of
transcription of the heterologous gene, or the level of
heterologous protein may be determined. The level of protein
may be determined by measuring the amount of protein, or the
activity of the protein, using techniques known to those
skilled in the art.



Alternatively, or additionally, a method of screening for
ability of a substance to modulate activity of a utrophin
promoter may comprise contacting a cell containing an
endogenous utrophin gene (e.g. a mammalian muscle cell) with a
test substance and measuring the level of RNA transcription or
protein expression using binding members specific for the
nucleic acid or polypeptides disclosed herein. Specific
binding members include antibodies and nucleic acid probes.

The level of expression in the presence of the test substance
may be compared with the level of expression in the absence of
the test substance. A difference in expression in the
presence of the test substance indicates ability of the 
substance to modulate gene expression. An increase in 
expression of the heterologous gene compared with expression 
of another gene not linked to a promoter as disclosed herein 
indicates specificity of the substance for modulation of the 
utrophin promoter. 
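
The comparison described above may be expressed as a simple calculation. The Python sketch below assumes that expression is summarised as a single reading per condition and that a two-fold change is taken as meaningful; the threshold, the function names and the example readings are all hypothetical and are not values taken from this specification.

```python
def fold_change(with_substance, without_substance):
    """Ratio of expression with the test substance to expression without it."""
    if without_substance <= 0:
        raise ValueError("baseline expression must be positive")
    return with_substance / without_substance

def assess(reporter_treated, reporter_untreated,
           control_treated, control_untreated, threshold=2.0):
    """Classify a test substance from reporter and control-gene readings.

    threshold is an arbitrary illustrative cut-off.
    """
    reporter_fc = fold_change(reporter_treated, reporter_untreated)
    control_fc = fold_change(control_treated, control_untreated)
    # Modulation: the reporter changes up or down by at least the threshold
    modulates = reporter_fc >= threshold or reporter_fc <= 1.0 / threshold
    # Specific increase: the reporter rises but the unlinked control does not
    specific = reporter_fc >= threshold and control_fc < threshold
    return {"reporter_fold": reporter_fc, "control_fold": control_fc,
            "modulates": modulates, "specific_increase": specific}

# Made-up readings: reporter 100 -> 300, unlinked control gene 100 -> 110
result = assess(300.0, 100.0, 110.0, 100.0)
print(result["reporter_fold"], result["specific_increase"])  # 3.0 True
```

In practice replicate measurements and an appropriate statistical test would be used in place of a fixed cut-off.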

A promoter construct may be transfected into a cell line using 
any technique previously described to produce a stable cell 
line containing the reporter construct integrated into the 
genome. The cells may be grown and incubated with test 
compounds for varying times. The cells may be grown in 96-well
plates to facilitate the analysis of large numbers of
compounds. The cells may then be washed and the reporter gene
expression analysed. For some reporters, such as luciferase,
the cells will be lysed then analysed. Previous experiments 
testing the effects of glucocorticoids on the endogenous 
utrophin protein and RNA levels in myoblasts have already been 
described [12,13] and techniques used for those experiments 
may similarly be employed. 

Constructs comprising one or more developmental and/or
time-specific regulatory motifs (as discussed) may be used to
screen for a substance able to modulate the corresponding
aspect of the promoter activity, e.g. muscle-specific
expression.

Following identification of a substance which modulates or
affects utrophin promoter activity, the substance may be
investigated further. Furthermore, it may be manufactured
and/or used in preparation, i.e. manufacture or formulation,
of a composition such as a medicament, pharmaceutical
composition or drug. These may be administered to
individuals.

As noted above, the inventors also identified a novel coding 
sequence (Exon 1B) which encodes a novel utrophin N-terminus.

According to a further aspect of the present invention there 
is provided a nucleic acid molecule which has a nucleotide 
sequence encoding a polypeptide which includes the amino acid 
sequence shown in Figure 1 or Figure 2.

Such a polypeptide may include other utrophin sequences, and 
the nucleic acid molecule may be in the form of a utrophin
"mini-gene" (discussed further below).

Such a polypeptide may include non-utrophin (i.e. heterologous 
or foreign) sequences and thereby form a larger fusion 
protein. For example, such a fusion protein could be used to 
target a non-utrophin polypeptide to muscle membranes.

The coding sequence included may be that shown in Figure 1 or 
Figure 2 or it may be a mutant, variant, derivative or allele 
of the sequence shown. The sequence may differ from that 
shown by a change which is one or more of addition, insertion, 
deletion and substitution of one or more nucleotides of the 
sequence shown. Changes to a nucleotide sequence may result 
in an amino acid change at the protein level, or not, as 
determined by the genetic code. 

Thus, nucleic acid according to the present invention may 
include a sequence different from the sequences shown in
Figure 1 or Figure 2 yet encode a polypeptide with the same
amino acid sequence. The amino acid sequences shown in Figure
1 and Figure 2 consist of 31 residues.

On the other hand the encoded polypeptide may comprise an 
amino acid sequence which differs by one or more amino acid 
residues from the amino acid sequences shown in Figure 1 or 
Figure 2. Nucleic acid encoding a polypeptide which is an
amino acid sequence mutant, variant, derivative or allele of
the sequences shown in Figure 1 and Figure 2 is further
provided by the present invention. Nucleic acid encoding
such a polypeptide may show at the nucleotide sequence and/or
encoded amino acid level greater than about 60% homology with
the coding sequence and/or the amino acid sequence shown in
Figure 1 or Figure 2, greater than about 70% homology, greater 
than about 80% homology, greater than about 90% homology or 
greater than about 95% homology. Determination of homology is 
discussed elsewhere herein. 

A polypeptide which is a variant, allele, derivative or mutant 
may have an amino acid sequence which differs from that given 
in a figure herein by one or more of addition, substitution, 
deletion and insertion of one or more amino acids. Preferred
such polypeptides have wild-type function, that is to say have
one or more of the following properties: immunological
cross-reactivity with an antibody reactive with the polypeptide
for which the sequence is given in Figure 1 or Figure 2;
sharing an epitope with the polypeptide for which the amino
acid sequence is shown in Figure 1 or Figure 2 (as determined
for example by immunological cross-reactivity between the two
polypeptides); a biological activity which is inhibited by an
antibody raised against the polypeptide whose sequence is shown
in Figure 1 or Figure 2; ability to bind muscle membrane;
ability to bind actin; ability to bind DPC.

Variations in amino acid sequence include "conservative
variation", i.e. substitution of one hydrophobic residue such 
as isoleucine, valine, leucine or methionine for another, or 
the substitution of one polar residue for another, such as 
arginine for lysine, glutamic for aspartic acid, or glutamine 
for asparagine. Particular amino acid sequence variants may 
differ from that shown in Figure 1 or Figure 2 by insertion, 
addition, substitution or deletion of 1 amino acid, 2, 3, 4, 
or 5-10 amino acids. 
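
The examples of conservative variation given above can be encoded as residue groups, as in the following Python sketch. The grouping is limited to the residues named in the preceding paragraph; other classification schemes exist, and the function names are hypothetical.

```python
# Groups taken from the examples in the text; real classifications vary.
CONSERVATIVE_GROUPS = [
    set("IVLM"),  # hydrophobic: isoleucine, valine, leucine, methionine
    set("RK"),    # arginine / lysine
    set("ED"),    # glutamic acid / aspartic acid
    set("QN"),    # glutamine / asparagine
]

def is_conservative(original, replacement):
    """True if two different residues (one-letter codes) share a group."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(original in g and replacement in g for g in CONSERVATIVE_GROUPS)

def count_substitutions(reference, variant):
    """Count (conservative, non-conservative) substitutions between two
    equal-length sequences; insertions and deletions are not handled."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    conservative = other = 0
    for a, b in zip(reference, variant):
        if a != b:
            if is_conservative(a, b):
                conservative += 1
            else:
                other += 1
    return conservative, other
```

For the made-up pair "MKVLD" and "MRVLG", count_substitutions returns one conservative change (K to R) and one non-conservative change (D to G).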

According to one aspect of the present invention there is 
provided a nucleic acid molecule comprising a sequence of 
nucleotides encoding a polypeptide with utrophin function. 
Utrophin nucleotide sequences which may be included in the 
nucleic acid molecule are disclosed in WO 97/22696 which is
incorporated herein by reference. 

See also Figure 8 and Figure 9 for disclosure of nucleic acid 
molecules and polypeptides according to the present invention, 
comprising the exon 1B sequence of the invention.

A polypeptide with utrophin function is able to bind actin and 
able to bind the dystrophin protein complex (DPC).

The nucleic acid molecule may be an isolate, or in an isolated 
and/or purified form, that is to say not in an environment in 
which it is found in nature, removed from its natural 
environment. It may be free from other nucleic acid
obtainable from the same species, e.g. encoding another
polypeptide.

In one embodiment, the nucleic acid molecule is a "mini-gene",
i.e. the polypeptide encoded does not correspond to
full-length utrophin but is rather shorter, a truncated version
(utrophin mini-genes are discussed in WO 97/22696). For
instance, part or all of the rod domain may be missing, such
that the polypeptide comprises an actin-binding domain and a
DPC-binding domain but is shorter than naturally occurring
utrophin. In a full-length utrophin gene including what are
identified herein as exons 1A and 1B, the actin-binding domain
is encoded by nucleotides 1-739, while the DPC-binding domain
(CRCT) is encoded by nucleotides 8499-10301 (where 1
represents the start of translation). See also Figure 8. The
respective domains in the polypeptide encoded by a mini-gene
according to the invention may comprise amino acids
corresponding to those encoded by these nucleotides in the
full-length coding sequence. In one embodiment, a minigene
according to the present invention comprises or consists of
the amino acid sequence encoded by nucleotides 1-739 and
8499-10301 of the A isoform of utrophin in which exon 1B as
identified herein is substituted for exons 1A and 2A. The
sequence of such a minigene can be constructed by the ordinary
skilled person using information disclosed herein, taking into
account the content of WO 97/22696 and Tinsley et al, Nature
(1996) 384:349. The nucleic acid sequence and predicted
amino acid sequence encoded by a "mini-gene" according to the
present invention are shown in Figure 9.
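
The nucleotide coordinates given above are 1-based and inclusive, which is easily mishandled when a sequence is manipulated in software. The following Python sketch shows the coordinate arithmetic only. It uses a placeholder string rather than any real utrophin sequence, the function names are hypothetical, and it makes no attempt to verify the reading frame at the junction or the exon 1B substitution, for which Figures 8 and 9 remain the reference.

```python
def extract(seq, start, end):
    """Return nucleotides start..end of seq (1-based, inclusive)."""
    if start < 1 or end > len(seq) or start > end:
        raise ValueError("coordinates outside sequence")
    return seq[start - 1:end]

def build_minigene(full_length_cds):
    """Join the actin-binding and DPC-binding (CRCT) coding regions."""
    actin_binding = extract(full_length_cds, 1, 739)
    dpc_binding = extract(full_length_cds, 8499, 10301)
    return actin_binding + dpc_binding

# Placeholder only: a string of N standing in for a full-length coding sequence
placeholder = "N" * 10301
minigene = build_minigene(placeholder)
print(len(minigene))  # 2542
```

The two regions contribute 739 and 1803 nucleotides respectively, so the joined sequence is 2542 nucleotides long.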

Advantages of a mini-gene over a sequence encoding a
full-length utrophin molecule or derivative thereof include easier
manipulation and inclusion in vectors, such as adenoviral and 
retroviral vectors for delivery and expression. 

A further preferred non-naturally occurring nucleic acid 
molecule encoding a polypeptide with the specified 
characteristics is a chimaeric construct wherein the encoding 
sequence comprises a sequence obtainable from one mammal, 
preferably human ("a human sequence"), and a sequence 
obtainable from another mammal, preferably mouse ("a mouse
sequence"). Such a chimaeric construct may of course comprise
the addition, insertion, substitution and/or deletion of one
or more nucleotides with respect to the parent mammalian
sequences from which it is derived. Preferably, the part of
the coding sequence which encodes the actin-binding domain
comprises a sequence of nucleotides obtainable from the mouse,
or other non-human mammal, or a sequence of nucleotides
derived from a sequence obtainable from the mouse, or other
non-human mammal.

In a preferred embodiment, the sequence of nucleotides 
encoding the polypeptide comprises sequence GAGGCAC at 
residues 331-337 and/or the sequence GATTGTGGATGT^AAACAGTGGG at 
residues 1453-1475 (using the conventional numbering from the 
initiation codon ATG), and a sequence obtainable from a human.

Nucleic acid according to the present invention is obtainable 
using one or more oligonucleotide probes or primers designed 
to hybridise with one or more fragments of a nucleic acid 
sequence shown in Figure 1 or Figure 2, particularly fragments
of relatively rare sequence, based on codon usage or
statistical analysis. The amino acid sequence information
provided may be used in design of degenerate probes/primers or
"long" probes. A primer designed to hybridise with a fragment
of the nucleic acid sequence shown may be used in conjunction
with one or more oligonucleotides designed to hybridise to a
sequence in a cloning vector within which target nucleic acid
has been cloned, or in so-called "RACE" (rapid amplification
of cDNA ends) in which cDNAs in a library are ligated to an
oligonucleotide linker and PCR is performed using a primer
which hybridises with the sequence shown in the figures and a 
primer which hybridises to the oligonucleotide linker. 

Nucleic acid isolated and/or purified from one or more cells 
(e.g. human, mouse) or a nucleic acid library derived from 
nucleic acid isolated and/or purified from cells (e.g. a cDNA 
library derived from mRNA isolated from the cells), may be
probed under conditions for selective hybridisation and/or
subjected to a specific nucleic acid amplification reaction
such as the polymerase chain reaction (PCR).

A method may include hybridisation of one or more (e.g. two) 
probes or primers to target nucleic acid. Where the nucleic 
acid is double-stranded DNA, hybridisation will generally be
preceded by denaturation to produce single-stranded DNA. The
hybridisation may be as part of a PCR procedure, or as part of
a probing procedure not involving PCR. An example procedure
would be a combination of PCR and low stringency
hybridisation. A screening procedure, chosen from the many
available to those skilled in the art, is used to identify
successful hybridisation events and isolate hybridised
nucleic acid.

Probing may employ the standard Southern blotting technique. 
For instance DNA may be extracted from cells and digested with 
different restriction enzymes. Restriction fragments may then 
be separated by electrophoresis on an agarose gel, before 
denaturation and transfer to a nitrocellulose filter. 
Labelled probe may be hybridised to the DNA fragments on the 
filter and binding determined. DNA for probing may be 
prepared from RNA preparations from cells.

Preliminary experiments may be performed by hybridising under 
low stringency conditions various probes to Southern blots of 
DNA digested with restriction enzymes. Suitable conditions 
would be achieved when a large number of hybridising fragments 
were obtained while the background hybridisation was low. 
Using these conditions nucleic acid libraries, e.g. cDNA 
libraries representative of expressed sequences, may be 
searched. 

It may be necessary for one or more gene fragments to be 
ligated to generate a full-length coding sequence. Also, 
where a full-length encoding nucleic acid molecule has not 
been obtained, a smaller molecule representing part of the 
full molecule, may be used to obtain full-length clones.
Inserts may be prepared from partial cDNA clones and used to
screen cDNA libraries.

Those skilled in the art are well able to employ suitable 
conditions of the desired stringency for selective 
hybridisation, taking into account factors such as 
oligonucleotide length and base composition, temperature and 
so on. Exemplary conditions have been discussed already 
above.



Nucleic acid according to the present invention may form part
of a cloning vector and/or a vector from which the encoded
polypeptide may be expressed. Polypeptide expression is
discussed below. Suitable vectors can be chosen or
constructed, containing appropriate and appropriately
positioned regulatory sequences, as discussed elsewhere
herein.

A further aspect of the present invention provides a 
polypeptide which comprises the amino acid sequence shown in 
Figure 1 or Figure 2. As mentioned earlier such a polypeptide
may include other utrophin sequences or may include
heterologous sequences.

Polypeptides which are amino acid sequence variants, alleles, 
derivatives or mutants are also provided by the present 
invention. Such polypeptides are discussed elsewhere herein. 

The skilled person can use the techniques described herein and
others well known in the art to produce large amounts of 
peptides, for instance by expression from encoding nucleic 
acid. 

In a further aspect the invention provides a method of making
a polypeptide, the method including expression from nucleic
acid encoding the polypeptide (generally nucleic acid
according to the invention). This may conveniently be
achieved by growing in culture a host cell containing such a
vector, under suitable conditions which cause or allow 
expression of the polypeptide. Polypeptides may also be 
expressed in in vitro systems such as reticulocyte lysate. 

Systems for cloning and expression of a polypeptide in a 
variety of different host cells are well known. Suitable host 
cells include bacteria, mammalian cells, yeast and baculovirus 
systems. Mammalian cell lines available in the art for
expression of a heterologous polypeptide include Chinese
hamster ovary cells, HeLa cells, baby hamster kidney cells and
many others. A common, preferred bacterial host is E. coli.

Thus, a further aspect of the present invention provides a 
host cell containing heterologous nucleic acid encoding a 
polypeptide as disclosed herein. 

The nucleic acid may be integrated into the genome (e.g. 
chromosome) of the host cell or may be on an extra-chromosomal
vector within the cell, or otherwise identifiably heterologous
or foreign to the cell.

A still further aspect provides a method comprising 
introducing such nucleic acid into a host cell. Suitable 
techniques are discussed elsewhere herein. 

The introduction may be followed by causing or allowing 
expression from the nucleic acid, e.g. by culturing host cells 
under conditions for expression of the gene. 

The polypeptide encoded by the nucleic acid may be expressed 
from the nucleic acid in vitro, e.g. in a cell-free system or
in cultured cells, or in vivo. 

If the polypeptide is expressed coupled to an appropriate 
signal leader peptide it may be secreted from the cell into 
the culture medium.

Peptides can also be generated wholly or partly by chemical 
synthesis. The compounds of the present invention can be
readily prepared according to well-established, standard
liquid or, preferably, solid-phase peptide synthesis methods,
general descriptions of which are broadly available (see, for
example, in J.M. Stewart and J.D. Young, Solid Phase Peptide
Synthesis, 2nd edition, Pierce Chemical Company, Rockford,
Illinois (1984), in M. Bodanszky and A. Bodanszky, The
Practice of Peptide Synthesis, Springer Verlag, New York
(1984); and Applied Biosystems 430A Users Manual, ABI Inc.,
Foster City, California), or they may be prepared in solution,
by the liquid phase method or by any combination of
solid-phase, liquid phase and solution chemistry, e.g. by first
completing the respective peptide portion and then, if desired
and appropriate, after removal of any protecting groups being
present, by introduction of the residue X by reaction of the
respective carbonic or sulfonic acid or a reactive derivative
thereof.

The present invention also includes active portions,
fragments, derivatives and functional mimetics of the 
polypeptides of the invention. An "active portion" of a 
polypeptide means a peptide which is less than said full 
length polypeptide, but which retains a biological activity, 
such as a biological activity selected from binding to ligand, 
binding to muscle membrane. Such an active fragment may be 
included as part of a fusion protein, e.g. including a 
polypeptide which is to be targetted to the muscle membrane. 

A "fragment" of a polypeptide generally means a stretch of 
amino acid residues of about five to twenty-five contiguous 
amino acids, typically about ten to twenty contiguous amino 
acids. Fragments of the novel N-terminus polypeptide sequence
may include antigenic determinants or epitopes useful for
raising antibodies to a portion of the amino acid sequence, or
may be sequence useful for targetting to muscle membrane.
Alanine scans are commonly used to find and refine peptide
motifs within polypeptides, this involving the systematic
replacement of each residue in turn with the amino acid
alanine, followed by an assessment of biological activity.
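
The set of variants required for an alanine scan can be enumerated mechanically, as in the Python sketch below. The peptide "MAKYG" is a made-up example and not a sequence of the invention, the function name is hypothetical, and the assessment of biological activity for each variant is an experimental step outside the scope of the sketch.

```python
def alanine_scan(peptide):
    """Yield (position, original_residue, variant) for each single alanine
    replacement. Positions are 1-based; existing alanines are skipped."""
    for index, residue in enumerate(peptide):
        if residue == "A":
            continue
        yield index + 1, residue, peptide[:index] + "A" + peptide[index + 1:]

# "MAKYG" is a made-up peptide used only for illustration
for position, residue, variant in alanine_scan("MAKYG"):
    print(f"{residue}{position}A  {variant}")
```

Each variant differs from the parent at exactly one position, so a loss of activity can be attributed to the residue that was replaced.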

Preferred fragments of exon 1B polypeptide include those
comprising or consisting of an epitope which may be used for 
instance in raising or isolating antibodies. Variant and 
derivative peptides, peptides which have an amino acid 
sequence which differs from one of these sequences by way of 
addition, insertion, deletion or substitution of one or more 
amino acids are also provided by the present invention. 

A "derivative" of a polypeptide or a fragment thereof may 
include a polypeptide modified by varying the amino acid 
sequence of the protein, e.g. by manipulation of the nucleic 
acid encoding the protein or by altering the protein itself. 
Such derivatives of the natural amino acid sequence may 
involve one or more of insertion, addition, deletion or 
substitution of one or more amino acids, which may be without 
fundamentally altering the qualitative nature of biological 
activity of the wild type polypeptide. Also encompassed 
within the scope of the present invention are functional 
mimetics of active fragments of the exon 1B polypeptides
provided (including alleles, mutants, derivatives and
variants). The term "functional mimetic" means a substance
which may not contain an active portion of the relevant amino
acid sequence, and probably is not a peptide at all, but which
retains in qualitative terms biological activity of natural
exon 1B polypeptide. The design and screening of candidate
mimetics is described in detail below. 

A polypeptide according to the present invention may be 
isolated and/or purified (e.g. using an antibody) for instance 
after production by expression from encoding nucleic acid (for 
which see below). Thus, a polypeptide may be provided free or 
substantially free from contaminants with which it is 
naturally associated (if it is a naturally-occurring 
polypeptide). A polypeptide may be provided free or 
substantially free of other polypeptides. Polypeptides 
according to the present invention may be generated wholly or 
partly by chemical synthesis. The isolated and/or purified 
polypeptide may be used in formulation of a composition, which 
may include at least one additional component, for example a 
pharmaceutical composition including a pharmaceutically 
acceptable excipient, vehicle or carrier. A composition 
including a polypeptide according to the invention may be used 
in prophylactic and/or therapeutic treatment as discussed 
below. 

A polypeptide, peptide, allele, mutant, derivative or variant 
according to the present invention may be used as an immunogen 
or otherwise in obtaining specific antibodies. Antibodies are 
useful in purification and other manipulation of polypeptides 
and peptides, diagnostic screening and therapeutic contexts. 

Accordingly, a further aspect of the present invention 
provides an antibody able to bind specifically to the 
polypeptide whose sequence is given in Figure 1 or Figure 2. 
Such an antibody may be specific in the sense of being able to 
distinguish between the polypeptide it is able to bind and 
other human (or mouse) polypeptides for which it has no or 
substantially no binding affinity (e.g. a binding affinity of 
about 1000x less). Specific antibodies bind an epitope on the 
molecule which is either not present or is not accessible on 
other molecules. Antibodies according to the present 
invention may be specific for the wild-type polypeptide. 
Antibodies according to the invention may be specific for a 
particular mutant, variant, allele or derivative polypeptide 
as between that molecule and the wild-type polypeptide, so as 
to be useful in diagnostic and prognostic methods as discussed 
below. Antibodies are also useful in purifying the 
polypeptide or polypeptides to which they bind, e.g. following 
production by recombinant expression from encoding nucleic 
acid. 

Preferred antibodies according to the invention are isolated, 
in the sense of being free from contaminants such as 
antibodies able to bind other polypeptides and/or free of 
serum components. Monoclonal antibodies are preferred for 
some purposes, though polyclonal antibodies are within the 
scope of the present invention. 

Antibodies may be obtained using techniques which are standard 
in the art. Methods of producing antibodies include 
immunising a mammal (e.g. mouse, rat, rabbit, horse, goat, 
sheep or monkey) with the protein or a fragment thereof. 
Antibodies may be obtained from immunised animals using any of 
a variety of techniques known in the art, and screened, 
preferably using binding of antibody to antigen of interest. 
For instance, Western blotting techniques or 
immunoprecipitation may be used (Armitage et al., 1992, 
Nature 357: 80-82). Isolation of antibodies and/or 
antibody-producing cells from an animal may be accompanied by a step of 
sacrificing the animal. 

As an alternative or supplement to immunising a mammal with a 
peptide, an antibody specific for a protein may be obtained 
from a recombinantly produced library of expressed 
immunoglobulin variable domains, e.g. using lambda 
bacteriophage or filamentous bacteriophage which display 
functional immunoglobulin binding domains on their surfaces; 
for instance see WO92/01047. The library may be naive, that 
is constructed from sequences obtained from an organism which 
has not been immunised with any of the proteins (or 
fragments), or may be one constructed using sequences obtained 
from an organism which has been exposed to the antigen of 
interest. 

Antibodies according to the present invention may be modified 
in a number of ways. Indeed the term "antibody" should be 
construed as covering any binding substance having a binding 
domain with the required specificity. Thus the invention 
covers antibody fragments, derivatives, functional equivalents 
and homologues of antibodies, including synthetic molecules 
and molecules whose shape mimics that of an antibody enabling 
it to bind an antigen or epitope. 

Example antibody fragments, capable of binding an antigen or 
other binding partner, are the Fab fragment consisting of the 
VL, VH, CL and CH1 domains; the Fd fragment consisting of the 
VH and CH1 domains; the Fv fragment consisting of the VL and 
VH domains of a single arm of an antibody; the dAb fragment 
which consists of a VH domain; isolated CDR regions and 
F(ab')2 fragments, a bivalent fragment including two Fab 
fragments linked by a disulphide bridge at the hinge region. 
Single chain Fv fragments are also included. 

A hybridoma producing a monoclonal antibody according to the 
present invention may be subject to genetic mutation or other 
changes. It will further be understood by those skilled in 
the art that a monoclonal antibody can be subjected to the 
techniques of recombinant DNA technology to produce other 
antibodies or chimeric molecules which retain the specificity 
of the original antibody. Such techniques may involve 
introducing DNA encoding the immunoglobulin variable region, 
or the complementarity determining regions (CDRs), of an 
antibody to the constant regions, or constant regions plus 
framework regions, of a different immunoglobulin. See, for 
instance, EP184187A, GB 2188638A or EP-A-0239400. Cloning and 
expression of chimeric antibodies are described in EP-A-0120694 
and EP-A-0125023. 

Hybridomas capable of producing antibody with desired binding 
characteristics are within the scope of the present 
invention, as are host cells, eukaryotic or prokaryotic, 
containing nucleic acid encoding antibodies (including 
antibody fragments) and capable of their expression. The 
invention also provides methods of production of the 
antibodies including growing a cell capable of producing the 
antibody under conditions in which the antibody is produced, 
and preferably secreted. 

The reactivities of antibodies on a sample may be determined 
by any appropriate means. Tagging with individual reporter 
molecules is one possibility. The reporter molecules may 
directly or indirectly generate detectable, and preferably 
measurable, signals. The linkage of reporter molecules may be 
direct or indirect, covalent, e.g. via a peptide bond, or 
non-covalent. Linkage via a peptide bond may be as a result 
of recombinant expression of a gene fusion encoding antibody 
and reporter molecule. 

One favoured mode is by covalent linkage of each antibody with 
an individual fluorochrome, phosphor or laser dye with 
spectrally isolated absorption or emission characteristics. 
Suitable fluorochromes include fluorescein, rhodamine, 
phycoerythrin and Texas Red. Suitable chromogenic dyes 
include diaminobenzidine. 

Other reporters include macromolecular colloidal particles or 
particulate material such as latex beads that are coloured, 
magnetic or paramagnetic, and biologically or chemically 
active agents that can directly or indirectly cause detectable 
signals to be visually observed, electronically detected or 
otherwise recorded. These molecules may be enzymes which 
catalyse reactions that develop or change colours or cause 
changes in electrical properties, for example. They may be 
molecularly excitable, such that electronic transitions 
between energy states result in characteristic spectral 
absorptions or emissions. They may include chemical entities 
used in conjunction with biosensors. Biotin/avidin or 
biotin/streptavidin and alkaline phosphatase detection systems 
may be employed. 

The mode of determining binding is not a feature of the 
present invention and those skilled in the art are able to 
choose a suitable mode according to their preference and 
general knowledge. Particular embodiments of antibodies 
according to the present invention include antibodies able to 
bind and/or which bind specifically, e.g. with an affinity of 
at least 10"^ M, to the peptides shown in Figure 1 or Figure 2. 

Antibodies according to the present invention may be used in 
screening for the presence of a polypeptide, for example in a 
test sample containing cells or cell lysate as discussed, and 
may be used in purifying and/or isolating a polypeptide 
according to the present invention, for instance following 
production of the polypeptide by expression from encoding 
nucleic acid therefor. 

An antibody may be provided in a kit, which may include 
instructions for use of the antibody, e.g. in determining the 
presence of a particular substance in a test sample. One or 
more other reagents may be included, such as labelling 
molecules, buffer solutions, elutants and so on. Reagents may 
be provided within containers which protect them from the 
external environment, such as a sealed vial. 

The present invention extends in various aspects not only to a 
substance identified using a nucleic acid molecule as a 
modulator of utrophin promoter activity, or to a polypeptide, 
or nucleic acid molecule in accordance with what is disclosed 
herein, but also a pharmaceutical composition, medicament, 
drug or other composition comprising such a substance, a 
method comprising administration of such a composition to a 
patient, e.g. for increasing utrophin expression for instance 
in treatment of muscular dystrophy, use of such a substance in 
manufacture of a composition for administration, e.g. for 
increasing utrophin expression for instance in treatment of 
muscular dystrophy, and a method of making a pharmaceutical 
composition comprising admixing such a substance with a 
pharmaceutically acceptable excipient, vehicle or carrier, and 
optionally other ingredients. 

Administration will preferably be in a "therapeutically 
effective amount", this being sufficient to show benefit to a 
patient. Such benefit may be at least amelioration of at 
least one symptom. The actual amount administered, and rate 
and time-course of administration, will depend on the nature 
and severity of what is being treated. Prescription of 
treatment, eg decisions on dosage etc, is within the 
responsibility of general practitioners and other medical 
doctors. 

A composition may be administered alone or in combination with 
other treatments, either simultaneously or sequentially 
dependent upon the condition to be treated. 

Pharmaceutical compositions according to the present 
invention, and for use in accordance with the present 
invention, may comprise, in addition to active ingredient, a 
pharmaceutically acceptable excipient, carrier, buffer, 
stabiliser or other materials well known to those skilled in 
the art. Such materials should be non-toxic and should not 
interfere with the efficacy of the active ingredient. The 
precise nature of the carrier or other material will depend on 
the route of administration, which may be oral, or by 
injection, e.g. cutaneous, subcutaneous or intravenous. 



Pharmaceutical compositions for oral administration may be in 
tablet, capsule, powder or liquid form. A tablet may comprise 
a solid carrier such as gelatin or an adjuvant. Liquid 
pharmaceutical compositions generally comprise a liquid 
carrier such as water, petroleum, animal or vegetable oils, 
mineral oil or synthetic oil. Physiological saline solution, 
dextrose or other saccharide solution or glycols such as 
ethylene glycol, propylene glycol or polyethylene glycol may 
be included. 

For intravenous, cutaneous or subcutaneous injection, or 
injection at the site of affliction, the active ingredient 
will be in the form of a parenterally acceptable aqueous 
solution which is pyrogen-free and has suitable pH, 
isotonicity and stability. Those of relevant skill in the art 
are well able to prepare suitable solutions using, for 
example, isotonic vehicles such as Sodium Chloride Injection, 
Ringer's Injection, Lactated Ringer's Injection. 
Preservatives, stabilisers, buffers, antioxidants and/or other 
additives may be included, as required. 

Instead of a substance identified using a promoter as 
disclosed herein, a mimetic or mimic of the substance may be 
designed for pharmaceutical use. The designing of mimetics to 
a known pharmaceutically active compound is a known approach 
to the development of pharmaceuticals based on a "lead" 
compound. This might be desirable where the active compound 
is difficult or expensive to synthesise or where it is 
unsuitable for a particular method of administration, eg 
peptides are unsuitable active agents for oral compositions as 
they tend to be quickly degraded by proteases in the 
alimentary canal. Mimetic design, synthesis and testing may 
be used to avoid randomly screening large numbers of molecules 
for a target property. 

There are several steps commonly taken in the design of a 
mimetic from a compound having a given target property. 
Firstly, the particular parts of the compound that are 
critical and/or important in determining the target property 
are determined. In the case of a peptide, this can be done by 
systematically varying the amino acid residues in the peptide, 
eg by substituting each residue in turn. These parts or 
residues constituting the active region of the compound are 
known as its "pharmacophore". 

Once the pharmacophore has been found, its structure is 
modelled according to its physical properties, eg 
stereochemistry, bonding, size and/or charge, using data from 
a range of sources, eg spectroscopic techniques, X-ray 
diffraction data and NMR. Computational analysis, similarity 
mapping (which models the charge and/or volume of a 
pharmacophore, rather than the bonding between atoms) and 
other techniques can be used in this modelling process. 

In a variant of this approach, the three-dimensional structure 
of the ligand and its binding partner are modelled. This can 
be especially useful where the ligand and/or binding partner 
change conformation on binding, allowing the model to take 
account of this in the design of the mimetic. 

A template molecule is then selected onto which chemical 
groups which mimic the pharmacophore can be grafted. The 
template molecule and the chemical groups grafted on to it can 
conveniently be selected so that the mimetic is easy to 
synthesise, is likely to be pharmacologically acceptable, and 
does not degrade in vivo, while retaining the biological 
activity of the lead compound. The mimetic or mimetics found 
by this approach can then be screened to see whether they have 
the target property, or to what extent they exhibit it. 
Further optimisation or modification can then be carried out 
to arrive at one or more final mimetics for in vivo or 
clinical testing. 

Mimetics of substances identified as having ability to 
modulate utrophin promoter activity using a screening method 
as disclosed herein are included within the scope of the 
present invention. 

Modifications to and further aspects and embodiments of the 
present invention will be apparent to those skilled in the 
art. All documents mentioned herein are incorporated by 
reference. 

Experimental basis for and embodiments of the present 
invention will now be described in more detail, by way of 
example and not limitation, and with reference to the 
following figures: 

Figure 1 shows the sequence of the human exon 1B and promoter 
B. Numbering corresponds to the insert of pBSX2.0. The deduced 
translation of exon 1B is shown. The positions of features 
such as restriction sites, IL-6 response element and Alu 
repetitive elements are shown. 

Figure 2 shows the sequence of the mouse exon 1B and promoter 
B. Numbering corresponds to the insert of pBSX8.0. The deduced 
translation of exon 1B is shown. The positions of features 
such as restriction sites, IL-6 response element and Alu 
repetitive elements are shown. 

Figure 3 shows the sequence alignment of human (top) and mouse 
(bottom) exon 1B (in upper case) and promoter B. Numbering 
corresponds to the inserts of pBSX2.0 and pBSX8.0, 
respectively. The human PvuII site (see Figure 7) is 
indicated. The open triangle indicates the position at which 
the luciferase coding sequence was inserted to make 
pGL3/utroB/F (see below). The deduced translation of exon 1B 
is shown; amino acids marked in bold type are identical 
between the human and mouse sequences. The conserved splice 
donor consensus is shown in grey. Two putative Ap1 sites and 
an initiator-like element (Inr) are 100% conserved and 
indicated in black. A solid arrow marks the single 
transcription start indicated by primer extension; figures 
adjacent to the sequence indicate the number of individual 
5'RACE clones that terminated at the positions shown. 

Figure 4 shows the position of the primers used in RT-PCR of 
exon 1B-containing utrophin transcript, and the probes used to 
probe the PCR products. Primers specific to exon 1B (BF31) and 
utrophin C-terminus (CT2) were used to amplify 9816bp of 
utrophin cDNA. The products were blotted and probed with U41, 
U107, BR4 and U16 as indicated. The diagram is not to scale; 
numbering refers to the nucleotide sequence of the full-length 
cDNA. The corresponding functional domains of the protein are 
indicated above: actin binding domain; rod, rod domain; Cys, 
cysteine rich domain; C-Term, C-terminal domain. 

Figure 5 shows a schematic representation of (A) human YAC 
and (B) mouse PAC contigs showing position of exons within the 
genomic map. Key to mouse restriction sites: C, ClaI; S, 
SacII; B, BssHII; X, XhoI. (C) shows the nomenclature for 
utrophin promoters, exons and transcripts. 

Figure 6 shows the in vitro activity of utrophin promoter B. 
(A) shows normalised luciferase activity following 
transfection of three different human cell types with either 
pGL3/utroB/F ("forward construct") or pGL3/utroB/R ("reverse 
construct"). 

Figure 7 shows deletion analysis of promoter B. The 1.5kb 
insert of pGL3/utroB/F was deleted at its 5' and 3' ends using 
the internal restriction sites indicated. Reporter activity 
was assayed following transient transfection of IN157 and 
CL11T47 cells. 

Figure 8 shows conceptual translation of exon 1B as part of 
utrophin, showing a nucleotide sequence and encoded 
polypeptide according to embodiments of the present invention. 

Figure 9 shows the nucleic acid and predicted amino acid 
sequence of a utrophin B isoform "minigene". 

Figure 10 shows the dosage dependence of IL-6 mediated 
expression from the isoform B promoter. 

Oligonucleotides, PCR, RT-PCR and 5'RACE 

PCR and RT-PCR were performed as described (Blake, et al. 
(1996) J Biol Chem 271, 7802-7810). Oligonucleotide sequences 
(5' to 3') were: 

UM83    gatgttcctg tgaggccttc gag 
UM82    cactcttgga aaatcgagcg t 
U16     actatgatgt ctgccagagt tg 
U107    gatccaatag cttccttcca tcttt 
UBF     tggaaaaagt ggaggttgga 
BR2     tccaacctcc actttttcca 
BR4     gcctggagag ctacatgccc t 
BF8     ctccacatct ttttcctcat catct 
BF9     gattgtggtg atggttgtag aa 
BR10    gattgtggtg atggttgtag aa 
BR14    gatgatgagg aaaaagatgt ggag 
BF15    aaacccaaaa taacacagga catc 
BF16    agtgtaactt ctctctggtg 
BF31    taagcagatg taggtgatga gc 
BF42    gctgcttttg ttgtccactt c 
BR43    atagcttcct tccatctttg ag 
CT2     ctccacgttc ttccctctct act 
2ApF    gcgtgcagtg gaccattttt cagattta 
1BpF    cgctgcagca gccaccacat ttcgttg 
3pR     gcgtgcagat cgagcgttta tccatttg 
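
When working with a primer list of this kind it can be useful to check how forward and reverse oligonucleotides relate to one another. The following is a minimal sketch, assuming plain lower-case DNA strings; the helper name `reverse_complement` is illustrative only. As listed above, BR2 is the exact reverse complement of UBF, and the sketch confirms this.

```python
# Complement table for lower-case DNA bases.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence.

    Spaces used to group bases in tens are ignored.
    """
    cleaned = seq.lower().replace(" ", "")
    return cleaned.translate(COMPLEMENT)[::-1]


# Sequences copied from the oligonucleotide list above.
UBF = "tggaaaaagt ggaggttgga"
BR2 = "tccaacctcc actttttcca"

if __name__ == "__main__":
    print(reverse_complement(UBF) == BR2.replace(" ", ""))
```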



5'RACE was undertaken using adapter-ligated mouse heart cDNA 
(Marathon-Ready, Clontech), following the manufacturer's 
protocol, using the supplied adapter primers with nested mouse 
utrophin primers UM83 (exon 4) and UM82 (exon 3). Products 
were cloned in pGEM-T (Promega). Human exon 1B was isolated 
from skeletal muscle cDNA by PCR using mouse primers UBF and 
UM83. 5'RACE was used to clone the 5' end of human exon 1B, 
using primers U107 and BR4. Full-length utrophin RT-PCR was 
done as described (Blake, et al. (1996) J Biol Chem 271, 
7802-7810), but using Boehringer Expand Reverse Transcriptase and 
Long Template PCR reagents, and a primer annealing temperature 
of 59°C. Semi-quantitative RT-PCR was performed using primers 
BF42 and BR43 to amplify utrophin B, and commercial primers 
(Stratagene) to amplify glyceraldehyde-3-phosphate 
dehydrogenase (GAPDH). Exponential amplification was 
established by withdrawing samples from thermal cycling at 1 
cycle intervals over a range of 5 cycles, predicted to span 
the exponential range following initial experiments in which 
samples were withdrawn at 5 cycle intervals. Products were 
blotted and probed with labelled BR4 or a 600bp GAPDH probe. 
Band intensities were quantified using a Storm phosphoimager. 
A graph of log2[band intensity] versus cycle number showed a 
linear relationship with gradient = 1, indicating near-perfect 
exponential amplification. The band intensities at any given 
cycle over this range are therefore directly proportional to 
the amount of cDNA in the original samples. 
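
The gradient test described above can be illustrated with a short least-squares fit. This is a sketch only: the cycle numbers and band intensities below are hypothetical values chosen to show perfect doubling, not measured data, and the function name `fit_line` is illustrative. A gradient of 1 on the log2 scale corresponds to the product doubling at each cycle.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (gradient, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gradient = sxy / sxx
    return gradient, mean_y - gradient * mean_x


# Hypothetical band intensities sampled at 1 cycle intervals over 5 cycles.
cycles = [20, 21, 22, 23, 24]
intensities = [100.0, 200.0, 400.0, 800.0, 1600.0]

log2_intensity = [math.log2(value) for value in intensities]
gradient, intercept = fit_line(cycles, log2_intensity)

# Fold increase per cycle implied by the fitted gradient (2.0 = doubling).
fold_per_cycle = 2 ** gradient

if __name__ == "__main__":
    print(f"gradient = {gradient:.3f}, fold per cycle = {fold_per_cycle:.3f}")
```

Within the range where the gradient is close to 1, the ratio of band intensities between two samples at the same cycle estimates the ratio of their starting template amounts.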

Genomic Mapping and Clones 

Human YACs are as previously described (Pearce, et al. (1993) 
Hum Mol Genet 2, 1765-72). Southern blots of restriction 
digested YAC DNA were probed with end-labelled BR4. A 3.0kb 
hybridising XbaI fragment was cloned from YAC 4X124H10 (a YAC 
clone which contains a human genomic DNA insert) into 
pBlueScript (Stratagene) generating pBSX2.0. Mouse PACs were 
identified from the RPCI21 library. A 398bp exon 1B/promoter B 
DNA probe (UB400) encompassing human positions 1129 to 1527 
was used for exon 1B mapping. Library filters were screened 
with probes to exons 1A-5 (Dennis, et al. (1996) Nucleic Acids 
Res 24, 1646-52) and UB400. Eleven PACs were identified, and 
four of these arranged into a contig by restriction mapping. 
An 8.0kb XbaI fragment from PAC 110C24, that hybridised with 
UB400, was cloned in pBlueScript generating pBSX8.0. 

Northern Blots and Probes 

A human multiple tissue northern blot and β-actin control cDNA 
probe were obtained from Clontech. A utrophin C-terminal cDNA 
probe, encompassing the last 4.0kb of the utrophin message, 
was generated by PCR. Human exon 1B sequence between positions 
1480 and 1596 was cloned into pGEM-T and an exon 1B antisense 
riboprobe was transcribed (In Vitro Transcription Kit, 
Promega) from the SP6 promoter following linearisation of the 
plasmid with NcoI. Hybridisation was carried out at 70°C in 
50% formamide hybridisation buffer (Ausubel, et al. (1999) 
Current Protocols in Molecular Biology (Wiley)) and the 
filter was washed at 75°C in 0.1xSSC, 0.1% SDS for 2 hours. 

RNase Protection 

Specific probes spanning the exon 1B/3 and exon 2A/3 
boundaries were obtained by PCR amplification of mouse heart 
cDNA using primers 2ApF, 1BpF and 3pR. Products were cloned 
in the PstI site of pDP18 (Ambion) and sequenced. Plasmids 
were linearised with EcoRI (1B) or BamHI (2A); labelled 
antisense riboprobe was transcribed from the T7 promoter and 
gel purified. RNase protection was carried out using the RPAIII 
kit (Ambion) following the manufacturer's instructions (30µg 
total RNA unless stated, hybridisation temperature 42°C, RNase 
A/T1 dilution 1:200). Following electrophoretic separation, 
band intensities were quantified as above, and corrected for 
the amount of label present in each protected fragment. 

Promoter/Reporter Constructs 

Reporter constructs were generated by PCR amplification of the 
human sequence between positions 39 and 1503, using pBSX2.0 as 
template. Pfu polymerase was used with primers BF9 and BR14. 
Following 15 cycles of 96°C for 45 seconds, 62°C for 45 
seconds, 72°C for 4 minutes, products were dA-tailed and 
cloned in pGEM-T. Clones were identified with product in both 
orientations and insert, liberated by digestion with 
SacI/NcoI, was cloned into the SacI/NcoI sites of a 
promoterless luciferase reporter plasmid (pGL3 basic, 
Promega), generating constructs with insert in forward 
(pGL3/utroB/F) and reverse (pGL3/utroB/R) orientation with 
respect to the coding sequence of luciferase. Deletions of the 
forward construct were generated by cleavage at SpeI, NdeI, 
EcoRI and PvuII sites in the insert, followed by religation to 
sites in the 5' or 3' polylinker. Constructs were sequenced 
completely. 

Cell Culture and Transfections 

Three human cell lines (IN157 rhabdomyosarcoma (Nielsen et 
al., 1993, Mol Cell Endocrinol 93: 87-95), CL11T47 kidney 
epithelial and HeLa cervical epithelial (Cancer Research, 1952 
12: 264)) were maintained as described (Dennis, et al. (1996) 
Nucleic Acids Res 24, 1646-52). 2µg pGL3/utroB/F or R, or its 
molar equivalent, mixed with 0.8µg of LacZ control plasmid 
(pSV-β-gal, Promega) was transfected in each well of 6 well 
plates using Superfect (Qiagen), following the manufacturer's 
protocol. 48 hours later, cells were harvested and cell 
extracts were assayed for luciferase and β-galactosidase 
activity as described (Dennis, et al. (1996) Nucleic Acids Res 
24, 1646-52). Luciferase activity was standardised to 
β-galactosidase activity in each individual sample to control 
for transfection efficiency. Results are expressed as mean 
luciferase/β-galactosidase ratio for four individual 
transfections. Error bars indicate the standard error of the 
mean. For comparison of different constructs within the same 
cell line, results were standardised to those obtained with 
pGL3/utroB/F and are expressed as % of this value. For 
comparison of constructs between cell lines, results were 
standardised to those obtained with a luciferase-SV40 
promoter/enhancer plasmid (pGL3 control, Promega) that 
generates high levels of reporter activity in all cell lines 
tested. 
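
The standardisation just described can be sketched as follows. The readings are hypothetical numbers chosen for illustration, and the function names are not part of the disclosure; the sketch shows only the arithmetic of per-sample normalisation, the mean and standard error over four transfections, and expression as a percentage of a reference construct.

```python
from math import sqrt
from statistics import mean, stdev


def normalised_ratios(luciferase, beta_gal):
    """Luciferase/beta-galactosidase ratio for each individual sample."""
    return [luc / gal for luc, gal in zip(luciferase, beta_gal)]


def mean_and_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))


def percent_of_reference(value, reference):
    """Express a mean ratio as a percentage of the reference construct."""
    return 100.0 * value / reference


if __name__ == "__main__":
    # Hypothetical readings from four transfections of one construct.
    luc = [1200.0, 1100.0, 1300.0, 1250.0]
    gal = [10.0, 9.5, 11.0, 10.5]
    ratio_mean, ratio_sem = mean_and_sem(normalised_ratios(luc, gal))
    # Hypothetical mean ratio for the reference (forward) construct.
    reference_mean = 150.0
    print(ratio_mean, ratio_sem, percent_of_reference(ratio_mean, reference_mean))
```

Normalising each well to its own β-galactosidase reading before averaging keeps differences in transfection efficiency between wells from inflating the error bars.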

Primer Extension 

Primer extension was carried out as described (18); 
end-labelled primer BR2 was annealed to 0, 30 or 50 µg mouse heart 
total RNA at 55°C for 20 minutes, and extended at 42°C for 40 
minutes. Products were separated on a 6% polyacrylamide gel, 
under denaturing conditions, alongside a sequencing ladder 
generated from pBSX8.0 using primer BR2. 

Results 

An alternative 5' exon in utrophin mRNA 

Utrophin from a mouse heart cDNA library was amplified by 
5'RACE, and the resulting products cloned and sequenced. Of 12 
clones, 8 contained novel sequence 5' of exon 3. Below, we 
present evidence that the novel sequence is a single 
alternative 5' exon of utrophin containing a translational 
initiation codon. We refer to this sequence as "exon 1B" to 
distinguish it from the previously described 5' cDNA sequence 
comprising untranslated exon 1A and exon 2A which contains the 
translational start (Figure 5c). 

Figure 3 shows a sequence comparison of human and mouse exon 
1B, and genomic flanking sequence. The position and phase of 
the splice junction at the 5' end of exon 3 is identical for 
both exon 1B- and exon 2A-containing transcripts. Exon 1B 
contains a putative ATG translation initiation codon and open 
reading frame, in-frame with that of exon 3, predicting a 
novel 31 amino acid N-terminus to the utrophin protein. The 
context of the ATG codon is predicted to be favourable for 
translation in that there is a purine at position -3 (bold in 
Figure 3) (33). Human and mouse exons 1B show 82% nucleotide 
identity. The predicted translations are 84% identical and 94% 
similar. The position and context of the ATG codon are 
conserved. The human sequence contains a second putative ATG 
codon immediately 5' (position 1511, solid bar in Figure 1), 
followed by a TAG stop codon. As this ATG does not adhere to 
the Kozak consensus, is not associated with an open reading 
frame and is not present in the mouse sequence, we predict 
that this is not a functional translation start. A similar 
feature is present in human exon 2A, where the 5'UTR contains 
a short open reading frame prior to the true translation 
start. 
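
The two sequence checks referred to above, a purine at position -3 relative to the ATG and percentage identity between aligned sequences, can be sketched as follows. The sequences used are short hypothetical strings, not the exon 1B sequences of Figures 1 to 3, and the function names are illustrative. The treatment of gaps (columns containing a gap are skipped) is an assumption of this sketch; other conventions exist and give different percentages.

```python
def purine_at_minus3(seq: str, atg_index: int) -> bool:
    """True if the base three positions 5' of the A of the ATG is A or G."""
    if seq[atg_index:atg_index + 3].upper() != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3:
        raise ValueError("sequence does not extend to position -3")
    return seq[atg_index - 3].upper() in "AG"


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical positions in two pre-aligned sequences.

    Assumption of this sketch: columns with a gap ('-') in either
    sequence are excluded from the comparison.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only.
    print(purine_at_minus3("GCCACCATGG", 6))
    print(percent_identity("ACGT", "ACGA"))
```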

The transcript associated with exon 1B 

A human multiple tissue northern blot was probed with an exon 
1B anti-sense riboprobe. A single hybridising 13kb band was 
observed, identical to that produced by probing the same blot 
with a cDNA encompassing 4kb of the utrophin C-terminus, 
indicating that exon 1B is exclusively associated with a 
full-length utrophin mRNA. Exon 1B is ubiquitously expressed, and 
appears most abundant in heart and pancreas, and least 
abundant in the brain, relative to β-actin. This is similar 
to the expression profile of total full-length utrophin. 

RT-PCR was employed to confirm the association of exon 1B with 
a utrophin mRNA predicted to give rise to functional protein 
(Figure 4). Amplification of first strand cDNA from IN157 
cells utilising a forward primer specific to exon 1B (BF31) and 
a reverse primer within the utrophin C-terminus (CT2) produced 
a product of expected size. Successive hybridisation of this 
PCR product with domain-specific probes (U41, UBR4, U107 and 
U16) confirmed that exon 1B is associated with a utrophin 
transcript spanning the full coding sequence of the gene. 

The expression profiles of exons 1B and 2A were examined using 
RNase protection. Specific riboprobes corresponding to the 
exon 1B/3 and 2A/3 boundaries were simultaneously hybridised 
with total RNA, allowing direct quantitation of transcript 
abundance. B-utrophin is the most abundant form in the heart, 
whereas exon 2A-containing transcripts predominate in the 
kidney. Approximately equal amounts of exons 1B and 2A were 
observed in the brain and in skeletal muscle. 

Mapping and cloning of genomic sequence associated with exon 
1B 

Using probe BR4, exon 1B was mapped within our previously 
described human YAC contig (26) encompassing the 5' end of the 
utrophin locus (Figure 5a). A hybridising band was seen with 
YAC 4X124H10 but not 4X23E3 or 5C2 indicating that exon 1B 
lies within the 120kb intron 2 of the utrophin gene. A 
subsequent database search identified a clone from the HGMP 
human chromosome 6 sequencing project, containing exons 1A, 2A 
and 1B. This indicated that exon 1B lies 52.2kb 3' of exon 2A 
(Figure 5a). Probing the mouse genomic PAC library (RPCI21 
from P. DeJong, Roswell Park Cancer Institute) with utrophin 
exons 1A, 1B and 2-5 inclusive identified a series of genomic 
PACs spanning the 5' end of the mouse utrophin gene. Four of 
these PACs were assembled into a contig of the region. 
Hybridisation with UB400 confirmed that exon 1B lies within 
intron 2 in the mouse (Figure 5b), approximately 50kb 3' of 
exon 2. 

Human and mouse genomic fragments were obtained from the YAC
and PAC libraries, respectively. Genomic sequence
encompassing exon 1B was obtained by an Xba I digest of YAC
4X124H10 (human 3kb fragment) and PAC110c24 (mouse 8.8kb
fragment). These fragments were sub-cloned into pBluescript
vector; the human fragment was deleted to 2kb during the
sub-cloning. The plasmid clones were designated pBSX2.0 (human)
and pBSX8.0 (mouse). Comparison of the cDNA and genomic
sequence showed no evidence of a further 5' exon in the
transcript associated with exon 1B, suggesting that the
genomic flanking sequence contained the transcription start
and promoter element responsible for exon 1B expression. Our
nomenclature for utrophin 5' exons, transcripts and promoters
appears in Figure 5c.

Promoter B 

1.5kb of human genomic sequence 5' of exon 1B, including the
5'UTR of exon 1B, was cloned in both orientations into a
promoterless luciferase reporter vector. Three human cell
lines (IN157 rhabdomyosarcoma, CL11T47 kidney epithelial and
HeLa cervical epithelial) were transiently transfected with
these constructs. These three lines were chosen because they
are known to express utrophin mRNA and protein at different
levels. Reporter activity was detected at significantly higher
levels in cells transfected with the forward than the reverse
orientation construct, indicating promoter activity (Figure
6). Interestingly, the level of activity varied between cell
lines by an order of magnitude. Semi-quantitative RT-PCR
demonstrated that the variation of luciferase expression
mimicked the transcription profile of endogenous utrophin exon
1B. In contrast, the GA3PDH control showed identical
amplification in all cDNA samples, indicating that the
differences seen in B-utrophin amplification have arisen from
differences in the level of expression of the endogenous
B-utrophin transcript in these cell lines. These data show that
the 1.5kb of genomic sequence 5' of exon 1B utilised in these
reporter clones contains the necessary signals to initiate
transcription of exon 1B, and regulatory elements that
determine the level of expression in these cell lines.

To further delineate important elements within this region, a
series of 5' and 3' deletions of promoter B were made, and the
in vitro activity of each one assayed (Figure 7). A 300bp
element, contained within clone pGL3/utroB/F/D5' Pvu 1199,
retains 70% activity of the full 1.5kb construct in expressing
cell lines, and shows 74% identity between human and mouse
(Figure 3). Homology falls to 50% when sequence further 5' of
the human PvuII site is compared with corresponding mouse
sequence using a 35bp window. Homology was determined using
GAP, from version 20 of GCG, with default parameters as
already noted above.
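
The windowed comparison described above can be sketched as follows. This is an illustrative sketch only, not the GCG GAP program: it assumes the human and mouse sequences have already been aligned (gaps written as "-") and simply reports percent identity in each sliding window. The function name and the default 35-position window are chosen here for illustration.

```python
def window_identity(aligned_a, aligned_b, window=35):
    """Percent identity in each sliding window of two pre-aligned sequences.

    Assumption: both strings come from an existing pairwise alignment,
    so they have equal length and use '-' for gap positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if window < 1 or window > len(aligned_a):
        raise ValueError("window must be between 1 and the alignment length")

    # 1 where both sequences carry the same base, 0 for mismatches and gaps.
    matches = [
        1 if x == y and x != "-" else 0
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
    ]

    # Running sum so each step adds one column and drops one column.
    running = sum(matches[:window])
    scores = [100.0 * running / window]
    for i in range(window, len(matches)):
        running += matches[i] - matches[i - window]
        scores.append(100.0 * running / window)
    return scores
```

Plotting the returned list against alignment position would show where identity drops off, for example 5' of a chosen restriction site.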

Promoter B transcription start site 

The 5' ends of 8 human and 4 mouse 5'RACE clones clustered
around a putative cap site in the genomic sequence (Figure 3).
None of the 5'RACE clones generated by amplification across
the exon 3/exon 1B boundary extended further upstream. RT-PCR
was carried out using forward primers around this region with
a reverse primer in exon 4. A product of expected size was
amplified from IN157 cDNA by primers BF42 and BF8, but not
BF16 or BF15, indicating that the transcription start is
within the 18bp that separates the two primers BF15 and BF42.
These 18 bases contain the putative cap site and the cluster
of RACE clone 5' ends.

To map the start site accurately, primer extension using an
exon 1B reverse primer and mouse heart RNA was employed. This
yielded a single product, indicative of a single transcription
start site. Transcription initiates at mouse position 1183
within a 25-bp motif, which is 100% conserved between human
and mouse. Part of this motif, spanning the cap site, is a 6/7
base match for the initiator consensus, and correspondingly
shows homology to the initiators of other genes. The
transcription start site is homologous to the initiators of
other promoters. Consensus 1, initiator consensus derived from
sequence comparison of Inr+ genes (Azizkhan, et al. (1993)
Critical Reviews in Eukaryotic Gene Expression 3, 229-254);
consensus 2, experimentally-derived consensus for functional
initiator (Javahery, et al. (1994) Molecular and Cellular
Biology 14, 116-127); TdT, terminal deoxynucleotidyl
transferase; hRAR, human retinoic acid receptor α; mCREB,
mouse cAMP response element binding protein. Transcribed
sequence is indicated in bold uppercase. We consider this
promoter to be of the TATA-Inr+ type.
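
A "6/7 base match" of the kind described above can be counted with a short script. The sketch below is illustrative only: the consensus string YYANWYY is an assumption (a commonly quoted form of the functional initiator consensus, written in IUPAC codes), not the consensus lines of the figure, and the function names are hypothetical.

```python
# IUPAC nucleotide codes used by the consensus (subset sufficient here).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "CT", "R": "AG", "W": "AT", "S": "CG",
    "K": "GT", "M": "AC", "N": "ACGT",
}


def consensus_matches(site, consensus="YYANWYY"):
    """Count positions of `site` that satisfy the IUPAC consensus.

    Assumption: YYANWYY is used as the initiator consensus.
    """
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have equal length")
    return sum(base in IUPAC[code] for base, code in zip(site.upper(), consensus))


def best_window(sequence, consensus="YYANWYY"):
    """Return (start, score) of the best-matching window; earliest wins ties."""
    k = len(consensus)
    if len(sequence) < k:
        raise ValueError("sequence is shorter than the consensus")
    scored = [
        (consensus_matches(sequence[i:i + k], consensus), i)
        for i in range(len(sequence) - k + 1)
    ]
    score, start = max(scored, key=lambda item: (item[0], -item[1]))
    return start, score
```

Scanning a candidate region with `best_window` gives the position and match count of the most initiator-like heptamer, which can then be compared with the experimentally mapped start site.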
Assaying for substances which modulate utrophin promoter
activity

Method 1 : 

This method uses a mouse mdx-H2K myoblast line stably
transfected with a human 7.0kb utrophin promoter-luciferase
construct. On day 1 myoblast cells transfected with the
construct are plated out in 6-well dishes, with compound or
DMSO-only for the negative controls.

Four 6-well plates are used for every 3 compounds (the
compounds are dissolved in DMSO and stored prior to use). For
example, on each plate compounds A, B and C are each added to
1 well, while the remaining 3 wells contain only DMSO. This
results in 4 wells containing each compound and 12 wells with
DMSO alone. Due to the inherent noise of both the
harvesting/assay and cell seeding/growth steps, this is the
minimum number that results in meaningful analysis. Setting up
the plates in this way means that the data really are paired,
and can be analysed with a paired Student's t-test. This
provides a more powerful statistical analysis than putting
each compound on a different plate and comparing it with a
control plate.
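
The paired analysis described above can be sketched as follows. This is a minimal sketch under stated assumptions: each plate contributes one pair (the compound well against the mean of the three DMSO wells on the same plate), the luciferase readings are invented for illustration, and only the t statistic and degrees of freedom are computed; the p-value would be read from a t table or a statistics package.

```python
import math
from statistics import mean, stdev


def paired_t(treated, control):
    """Paired Student's t statistic and degrees of freedom."""
    if len(treated) != len(control) or len(treated) < 2:
        raise ValueError("need at least two matched pairs")
    diffs = [t - c for t, c in zip(treated, control)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    t_stat = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t_stat, len(diffs) - 1


# Hypothetical luciferase readings for one compound on four plates.
plates = [
    {"compound": 130.0, "dmso": [100.0, 95.0, 105.0]},
    {"compound": 118.0, "dmso": [90.0, 92.0, 88.0]},
    {"compound": 150.0, "dmso": [120.0, 115.0, 110.0]},
    {"compound": 125.0, "dmso": [101.0, 99.0, 103.0]},
]

# Assumption: the plate's three DMSO wells are averaged to form the control.
treated = [plate["compound"] for plate in plates]
control = [mean(plate["dmso"]) for plate in plates]

t_stat, df = paired_t(treated, control)
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

Pairing within a plate removes plate-to-plate variation in seeding and harvesting from the comparison, which is the gain in power referred to above.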

On day 4 the cells are harvested, and luciferase quantitation
and pairwise analysis are carried out.

Method 2: 

Compounds which up-regulate the endogenous utrophin promoter
can be found using mdx-H2K myoblasts that are not transfected
with the utrophin promoter-luciferase construct. Mdx
myoblasts can be used to mimic utrophin transcription and
protein stability in dystrophin-deficient cells.

Identification of utrophin protein expression 

Quantitative Western blotting is used to measure the level of
utrophin expression (Tinsley JM, et al., Nature Medicine 4,
1441-1444). Using 6-well plates and treating with compound as
described above generates enough total protein sample to test
by Western blotting. Antibodies specific to the A protein or
B protein are used to quantify levels of either protein.

Identification of utrophin RNA expression

Quantitative ribonuclease protection is used to analyse levels
of utrophin expression. A pairwise design is used, as
described above, but more cells are necessary. To see bands
clearly, about 20-30µg total RNA is used. Each compound and
control will need a 175 cm² tissue culture flask. A dual probe
to simultaneously identify the A transcript and B transcript
is used.

Using the two techniques described, compounds which modulate
utrophin levels are identified after cell treatment. The same
techniques are used for in vivo animal experiments where the
compound is administered to dystrophin-deficient mdx mice.

Interleukin-6 (IL-6) Interactions

Two related elements are present in the promoters of genes
encoding acute phase proteins that mediate an increase in
transcription stimulated by an IL-6 triggered signalling
cascade (Hocke et al., 1992). One of these was found to be
present in the exon 1B flanking sequence. Wild type and
mutated reporter fusions for IL-6 were therefore tested for
responsiveness in appropriate cell systems.

Constructs of the 1.5F B promoter, normal and mutant (consensus
change: ctggaa > gatatc; concerning the mutant: Hattori M et
al. (1990) Proc. Natl. Acad. Sci. USA. Mar; 87(6): 2364-8), were
introduced into a promoter-less luciferase reporter vector and
transfected into IN157 cells with a renilla firefly control.
Cells were washed and charcoal-stripped serum added 5 hours
post-transfection and left overnight. IL-6 amounts were added
as illustrated with an appropriate amount of IL-6 soluble
receptor. The cells were left for 24 hours and then assayed
for activity using a luminometer.

A dosage-dependent transcriptional response was noted in the
normal, but not the mutated, reporter construct (Figure 10).
This result indicates the existence of a cytokine-mediated
signalling pathway which causes up-regulation of the B utrophin
promoter through the interaction of IL-6 and IL-6 receptor with
the conserved IL-6 response element.
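
A dose-response readout of this kind is commonly reduced to fold induction over the untreated wells. The sketch below is illustrative only: it assumes each well yields a promoter-reporter reading and a co-transfected control reading, that the former is divided by the latter to correct for transfection efficiency, and that dose 0 is the baseline. The function names and data layout are hypothetical.

```python
from statistics import mean


def normalised_activity(reporter, control):
    """Reporter signal divided by the co-transfected control signal."""
    if control <= 0:
        raise ValueError("control reading must be positive")
    return reporter / control


def fold_induction(readings, baseline_dose=0.0):
    """Mean normalised activity at each dose relative to the baseline dose.

    `readings` maps dose -> list of (reporter, control) luminometer pairs.
    """
    ratios = {
        dose: mean([normalised_activity(r, c) for r, c in wells])
        for dose, wells in readings.items()
    }
    base = ratios[baseline_dose]
    return {dose: ratio / base for dose, ratio in ratios.items()}
```

Running the same calculation for the normal and the mutated construct gives two curves; a response element is implicated when induction rises with dose in the former but stays near 1 in the latter.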

Discussion

We have demonstrated that there is a second promoter within 
intron 2 of the utrophin gene, driving expression of a unique 
first exon that splices into a common 13kb mRNA. These data are 
important, both in terms of understanding the molecular 
physiology of utrophin expression, and in view of their 
application to therapeutic intervention in DMD. 

The functional consequences of genes having more than one
promoter have been postulated (reviewed in Ayoubi, et al.
(1996) FASEB J. 10, 453-460). A single gene may achieve a
complex temporal and spatial expression pattern by interaction
of different promoters with discrete subsets of transcription
factors. Dystrophin is an example: three dissimilar promoters
are active at different levels in specific cell types within
the heart, skeletal muscle and the brain (Gorecki, et al.
(1992) Hum Mol Genet 1, 505-510; Barnea, et al. (1990) Neuron
5, 881-888; Holder, et al. Human Genetics 97, 232-239).
Northern blot analysis, however, indicates that utrophin exon
1B is ubiquitously expressed, implying that promoters A and B
are co-expressed in many tissues. It is conceivable that
examination of transcript distribution in whole tissue samples
has masked cell type-specific patterns of expression. Data
from isolated human cell lines in vitro support this notion;
we observed large differences in promoter B activity between
different cell lines, consistent with an in vivo expression
profile involving specific cellular populations.
Alternatively, the two promoters may be spatially regulated at
a sub-cellular level. Within adult skeletal muscle fibres,
promoter A is synaptically driven (Gramolini, et al. (1997) J
Biol Chem 272, 8117-20), yet aggregates of utrophin mRNA are
detectable at up to 25% of extrasynaptic nuclei (Vater, et al.
(1998) Molecular and Cellular Neuroscience 10, 229-242).
Expression of promoter B in the extrasynaptic compartment
might be invoked as one possible explanation.

A second proposed function of alternative promoters is the
generation of transcripts with interchangeable 5' exons,
giving rise to mRNAs with alternative 5'UTRs or proteins with
novel N-terminal domains. Unlike exon 1B, utrophin exon 1A
contains a long GC-rich 5'UTR. In some transcripts, GC-rich
5'UTRs are not translated efficiently (Kozak, M. (1991) J Cell
Biol 115, 887-903), and there are examples of genes in which
alternative use of GC-rich and non-GC-rich 5'UTRs has been
implicated in post-transcriptional regulation of protein
synthesis (Nielson, et al. (1990) J Biol Chem 265,
13431-13434). In addition, the predicted 31 amino acids encoded
by exon 1B are different to the 26 amino acids of exon 2A; the
functions of the resulting N-termini may be different.

The discovery of a second promoter provides a new target for
the upregulation of utrophin to ameliorate the DMD phenotype.
Promoter B is highly regulated, probably by different factors
from promoter A, including IL-6. Elucidation of the mechanisms
responsible for the large difference in promoter B activity
between IN157 and HeLa cells might lead to identification of a
factor that can be delivered to muscle to activate utrophin
expression. Importantly, as the N-box motif is absent from
promoter B, this is unlikely to carry any risk of NMJ
disruption potentially inherent in the pharmacological
manipulation of synaptically regulated promoter A.
CLAIMS

1. An isolated nucleic acid comprising a promoter which 
comprises a sequence of nucleotides selected from (i) the 
human promoter sequence shown in Figure 1 and (ii) the mouse 
promoter sequence shown in Figure 2, free or substantially 
free of utrophin coding sequence. 

2. An isolated nucleic acid consisting essentially of a
promoter which comprises the sequence of nucleotides shown 5'
to position 1440 in Figure 1.

3. An isolated nucleic acid consisting essentially of a
promoter which comprises the sequence of nucleotides shown 5'
to position 1183 of the mouse sequence shown in Figure 2.

4. An isolated nucleic acid consisting essentially of a
promoter which comprises the nucleotides numbered 1199-1440
in the sequence shown in Figure 1.

5. An isolated nucleic acid consisting essentially of a
promoter which comprises the nucleotides numbered 959-1183 in
the sequence shown in Figure 2.

6. An isolated nucleic acid consisting essentially of a
promoter which comprises the nucleotide sequence
ACAGGACATCCCAGTGTGCAGTTCG.

7. An isolated nucleic acid consisting essentially of a
promoter which comprises a sequence of nucleotides that is an
allele, mutant or derivative, by way of addition, insertion,
deletion or substitution of one or more nucleotides, of the
promoter sequence shown in Figure 1, which sequence has at
least 60% homology with the promoter sequence shown in Figure
1 and which promoter, when operably linked to a sequence of
nucleotides, has the ability to initiate transcription of that
sequence, said transcription being muscle-specific.

8. An isolated nucleic acid consisting essentially of a
promoter which comprises a sequence of nucleotides that is an
allele, mutant or derivative, by way of addition, insertion,
deletion or substitution of one or more nucleotides, of the
promoter sequence shown in Figure 2, which sequence has at
least 60% homology with the promoter sequence shown in Figure
2 and which promoter, when operably linked to a sequence of
nucleotides, has the ability to initiate transcription of that
sequence, said transcription being muscle-specific.

9. An isolated nucleic acid consisting essentially of a
promoter which comprises a sequence of nucleotides that is an
allele, mutant or derivative, by way of addition, insertion,
deletion or substitution of one or more nucleotides, of the
promoter sequence shown in Figure 2, which hybridises to the
promoter sequence shown in Figure 2 under stringent
hybridisation conditions and which promoter, when operably
linked to a sequence of nucleotides, has the ability to
initiate transcription of that sequence, said transcription
being muscle-specific.

10. A nucleic acid construct comprising an isolated nucleic
acid according to any of the preceding claims operably linked
to a heterologous sequence.

11. A nucleic acid construct according to claim 10 wherein
the heterologous sequence is a coding sequence.

12. A nucleic acid construct according to claim 11 wherein
the heterologous sequence encodes a reporter molecule.

13. A host cell comprising a nucleic acid construct
according to any of claims 10 to 12.




14. A method comprising culturing a host cell according to
claim 13 under conditions for transcription of said
heterologous sequence from the promoter.

15. A method according to claim 14 wherein the heterologous
sequence is a coding sequence and the host cell is cultured
under conditions for expression of the encoded peptide or
polypeptide product.

16. A method according to claim 14 or claim 15 comprising
detection of transcription of the heterologous sequence.

17. A method according to claim 14 or claim 15 comprising
detection of expression of the encoded peptide or polypeptide
product.

18. A method of screening for a substance able to modulate
utrophin promoter activity, the method comprising contacting
an expression system containing a nucleic acid construct
according to any of claims 10 to 12 with a test or candidate
substance and determining transcription of the heterologous
sequence or expression of the encoded peptide or polypeptide
product.

19. A method according to claim 18 wherein the expression
system comprises a host cell containing said nucleic acid
construct.

20. A method which comprises, following identification of a 
substance able to modulate utrophin promoter activity in 
accordance with a method according to claim 18 or claim 19, 
manufacture of the substance and/or use of the substance in 
manufacture or formulation of a composition. 

21. The use of an isolated nucleic acid according to any of
claims 1 to 6 for promoting transcription of an operably
linked sequence of nucleotides.

22. The use of claim 21 wherein the transcription is
tissue-specific, with the tissue-specificity being
muscle-specific.

23. An isolated nucleic acid molecule comprising a
nucleotide sequence encoding a polypeptide including the amino
acid sequence shown in Figure 1 or Figure 2.

24. An isolated nucleic acid molecule comprising a
nucleotide sequence encoding a polypeptide that is an allele,
mutant or derivative of a polypeptide including the amino acid
sequence shown in Figure 1, which amino acid sequence has at
least 60% homology with the polypeptide sequence in Figure 1
or Figure 2.

25. An isolated nucleic acid molecule comprising a
nucleotide sequence encoding a polypeptide that is an allele,
mutant or derivative of a polypeptide shown in Figure 1 or
Figure 2, which nucleotide sequence hybridises with the
nucleotide sequence encoding the polypeptide in Figure 1 or
Figure 2 under stringent hybridisation conditions.

26. An isolated nucleic acid molecule comprising a
nucleotide sequence encoding a polypeptide having the amino
acid sequence shown in Figure 9.

27. An isolated nucleic acid molecule comprising the
nucleotide sequence shown in Figure 9.

28. Nucleic acid of any one of claims 23 to 27 comprised in
a vector.

29. Nucleic acid according to claim 28 wherein said vector
is an expression vector.

30. A host cell containing heterologous nucleic acid
according to any one of claims 23 to 29.

31. A cell according to claim 30 which is a muscle cell. 

32. A cell according to claim 30 wherein said polypeptide 
is expressed. 

33. A cell according to any of claims 30 to 32 which is in
a mammal.

34. A non-human mammal having a cell according to any of
claims 30 to 32.

35. A non-human mammal containing nucleic acid according to 
any of claims 23 to 29 . 

36. A method including introduction of nucleic acid 
according to any of claims 23 to 29 into a cell. 

37. A method according to claim 36 wherein said 
introduction takes place in vitro. 

38. A method which includes causing or allowing expression 
of the coding nucleotide sequence of heterologous nucleic acid 
according to any of claims 23 to 29 in a cell. 

39. A method according to claim 38 wherein the cell is part
of a mammal.

40. A method according to claim 38 wherein the expression
product is purified and/or isolated following expression.

41. A method according to claim 40 wherein the expression
product is formulated into a composition which includes at
least one additional component, following purification and/or
isolation of the expression product.

42. An isolated polypeptide as encoded by nucleic acid
according to any of claims 23 to 29.

43. An isolated utrophin exon 1B polypeptide selected from:

(i) human utrophin exon 1B polypeptide of which the amino
acid sequence is shown in Figure 1;

(ii) mouse utrophin exon 1B of which the amino acid sequence
is shown in Figure 1.

44. An isolated polypeptide including the human polypeptide
according to claim 43.

45. An isolated polypeptide including the mouse polypeptide
according to claim 44.

46. An isolated polypeptide which has 60% homology with
the polypeptide according to claim 44 or 45.

47. An isolated fragment of a polypeptide according to
claim 43, which fragment is 5 to 25 amino acids in length.

48. An isolated fragment of a polypeptide according to 
claim 43, which fragment is 10 to 20 amino acids in length. 

49. An antibody specific for a polypeptide according to any 
one of claims 42 to 48. 

50. A composition including a polypeptide according to any
one of claims 42 to 46, a fragment according to claim 47 or
claim 48, or an antibody according to claim 49, and a
pharmaceutically acceptable excipient.

51. Use of nucleic acid according to any of claims 23 to 29
in the manufacture of a medicament for treating a dystrophin
phenotype in a mammal.

52. Use of a polypeptide according to any of claims 42 to
48 or an antibody according to claim 49 in the manufacture of
a medicament for treating a dystrophin phenotype in a mammal.
(54) Title: UTROPHIN GENE PROMOTER



[Front-page figure: annotated promoter sequence with labelled sequence features; the sequence text is not recoverable from the scan.]



(57) Abstract: Second promoter for mouse and human utrophin genes. The promoters or fragments and derivatives may be used to
control transcription of heterologous sequences, including coding sequences of reporter genes. Expression systems such as host cells
containing nucleic acid constructs which comprise a promoter as provided operably linked to a heterologous sequence may be used
to screen substances for ability to modulate activity of the utrophin promoter. Substances with such ability may be manufactured
and/or used in the preparation of compositions such as medicaments. Up-regulation of utrophin expression may compensate for
dystrophin loss in muscular dystrophy patients.
Human B-utrophin up to nucleotide 1500, deduced translation

CCCAGTGTGCAGTTCGAAGGCTGCTTTTGTTGTCCACTTCCTCCACATCTTTTTCCTCAT 
1 + + + + + + 60 

GGGTCACACGTCAAGCTTCCGACGAAAACAACAGGTGAAGGAGGTGTAGAAAAAGGAGTA 

CATCTAAGCAGATGTAGGTGATGAGCGGCCTGGCAGCCACCACGTTTCATTGGAAAAAGT 
61 + + + + + + 120

GTAGATTCGTCTACATCCACTACTCGCCGGACCGTCGGTGGTGCAAAGTAACCTTTTTCA

MSGLAATTFHWKKC- 



Exon IB M- 



GCAGATTGGATTTGCCAGGGCATGTAGCTCTCCAGGCTTGCAAGCGATTACCAGATGAAC
121 + + + + + + 180

CGTCTAACCTAAACGGTCCCGTACATCGAGAGGTCCGAACGTTCGCTAATGGTCTACTTG

RLDLPGHVALQACKRLPDEH-



ACAATGACGTACAGAAGAAAACCTTTACCAAATGGATAAATGCTCGATTTTCAAAGAGTG 

181 + + + + + + 240

TGTTACTGCATGTCTTCTTTTGGAAATGGTTTACCTATTTACGAGCTAAAAGTTTCTCAC 

NDVQKKTFTKWIN ARFS KSG- 

GGAAACCACCCATCAATGATATGTTCACAGACCTCAAAGATGGAAGGAAGCTATTGGATC 

241 + + + + + + 300 

CCTTTGGTGGGTAGTTACTATACAAGTGTCTGGAGTTTCTACCTTCCTTCGATAACCTAG 

KPPI NDMFTDLKDGRKLLDL- 

TTCTAGAAGGCCTCACAGGAACATCACTGCCAAAGGAACGTGGTTCCACAAGGGTACATG 

301 + + + + + + 360 

AAGATCTTCCGGAGTGTCCTTGTAGTGACGGTTTCCTTGCACCAAGGTGTTCCCATGTAC 

LEGLTGTSLPKERGSTRVHA- 

CCTTAAATAACGTCAACAGAGTGCTGCAGGTTTTACATCAGAACAATGTGGAATTAGTGA 
361 + + + + + + 420 

GGAATTTATTGCAGTTGTCTCACGACGTCCAAAATGTAGTCTTGTTACACCTTAATCACT 

LNNVNRVLQVLHQNNVELVN- 

ATATAGGGGGAACTGACATTGTGGATGGAAATCACAAACTGACTTTGGGGTTACTTTGGA 

421 + + + + + + 480 

TATATCCCCCTTGACTGTAACACCTACCTTTAGTGTTTGACTGAAACCCCAATGAAACCT 

IGGTDIVDGNHKLTLGLLWS- 

GCATCATTTTGCACTGGCAGGTGAAAGATGTCATGAAGGATGTCATGTCGGACCTGCAGC 

481 + + + + + 540 

CGTAGTAAAACGTGACCGTCCACTTTCTACAGTACTTCCTACAGTACAGCCTGGACGTCG 

I I LHWQVKDVMKDVMSDLQQ- 

AGACGAACAGTGAGAAGATCCTGCTCAGCTGGGTGCGTCAGACCACCAGGCCCTACAGCC

541 + + + + + + 600 

TCTGCTTGTCACTCTTCTAGGACGAGTCGACCCACGCAGTCTGGTGGTCCGGGATGTCGG 

TNSEKI LLSWVRQTTRPYSQ- 

AAGTCAACGTCCTCAACTTCACCACCAGCTGGACAGATGGACTCGCCTTTAATGCTGTCC 

601 + + + + + + 660 

TTCAGTTGCAGGAGTTGAAGTGGTGGTCGACCTGTCTACCTGAGCGGAAATTACGACAGG 

VNVLNFTTSWTDGLAFNAVL- 

TCCACCGACATAAACCTGATCTCTTCAGCTGGGATAAAGTTGTCAAAATGTCACCAATTG 

661 + + + + + + 720

AGGTGGCTGTATTTGGACTAGAGAAGTCGACCCTATTTCAACAGTTTTACAGTGGTTAAC 

HRHK PDLFSWDKVVKMS PI E- 

Figure 8 
AGAGACTTGAACATGCCTTCAGCAAGGCTCAAACTTATTTGGGAATTGAAAAGCTGTTAG
721 + + + + + + 780

TCTCTGAACTTGTACGGAAGTCGTTCCGAGTTTGAATAAACCCTTAACTTTTCGACAATC 

RLEHAFSKAQTYLGIEKLLD-

ATCCTGAAGATGTTGCCGTTCGGCTTCCTGACAAGAAATCCATAATTATGTATTTAACAT 
781 + + + + + + 840

TAGGACTTCTACAACGGCAAGCCGAAGGACTGTTCTTTAGGTATTAATACATAAATTGTA 
PEDVAVRLPDKKSIIMYLTS-

CTTTGTTTGAGGTGCTACCTCAGCAAGTCACCATAGACGCCATCCGTGAGGTAGAGACAC 
841 + + + + + + 900

GAAACAAACTCCACGATGGAGTCGTTCAGTGGTATCTGCGGTAGGCACTCCATCTCTGTG 

LFEVLPQQVTIDAIREVETL-

TCCCAAGGAAATATAAAAAAGAATGTGAAGAAGAGGCAATTAATATACAGAGTACAGCGC 

901 + + + + + + 960 

AGGGTTCCTTTATATTTTTTCTTACACTTCTTCTCCGTTAATTATATGTCTCATGTCGCG 

PRKYKKECEEEAINIQSTAP- 

CTGAGGAGGAGCATGAGAGTCCCCGAGCTGAAACTCCCAGCACTGTCACTGAGGTCGACA 

961 + + + - + + + 1020 

GACTCCTCCTCGTACTCTCAGGGGCTCGACTTTGAGGGTCGTGACAGTGACTCCAGCTGT 

EEEHESPRAETPSTVTEVDM- 

TGGATCTGGACAGCTATCAGATTGCGTTGGAGGAAGTGCTGACCTGGTTGCTTTCTGCTG 

1021 + + + + + + 1080

ACCTAGACCTGTCGATAGTCTAACGCAACCTCCTTCACGACTGGACCAACGAAAGACGAC 

DLDSYQIALEEVLTWLLSAE-

AGGACACTTTCCAGGAGCAGGATGATATTTCTGATGATGTTGAAGAAGTCAAAGACCAGT 

1081 + + + + + + 1140 

TCCTGTGAAAGGTCCTCGTCCTACTATAAAGACTACTACAACTTCTTCAGTTTCTGGTCA 

DT FQEQDDI S DDVEEVKDQF- 

TTGCAACCCATGAAGCTTTTATGATGGAACTGACTGCACACCAGAGCAGTGTGGGCAGCG 

1141 + + + + + + 1200 

AACGTTGGGTACTTCGAAAATACTACCTTGACTGACGTGTGGTCTCGTTACACCCGTCGC

ATHEAFMMELTAHQSSVGSV- 

TCCTGCAGGCAGGCAACCAACTGATAACACAAGGAACTCTGTCAGACGAAGAAGAATTTG 

1201 + + + + + . + 1260 

AGGACGTCCGTCCGTTGGTTGACTATTGTGTTCCTTGAGACAGTCTGCTTCTTCTTAAAC 

LQAGNQLITQGTLSDEEEFE-

AGATTCAGGAACAGATGACCCTGCTGAATGCTAGATGGGAGGCTCTTAGGGTGGAGAGTA 

1261 + + + + + + 1320 

TCTAAGTCCTTGTCTACTGGGACGACTTACGATCTACCCTCCGAGAATCCCACCTCTCAT 

IQEQMTLLNARWEALRVESM-

TGGACAGACAGTCCCGGCTGCACGATGTGCTGATGGAACTGCAGAAGAAGCAACTGCAGC 

1321 + + + + — + + 1380 

ACCTGTCTGTCAGGGCCGACGTGCTACACGACTACCTTGACGTCTTCTTCGTTGACGTCG 

DRQSRLHDVLMELQKKQLQQ- 

AGCTCTCCGCCTGGTTAACACTCACAGAGGAGCGCATTCAGAAGATGGAAACTTGCCCCC 
1381 + + + + . + + 1440 

TCGAGAGGCGGACCAATTGTGAGTGTCTCCTCGCGTAAGTCTTCTACCTTTGAACGGGGG 

LSAWLTLTEERIQKMETCPL-

TGGATGATGATGTAAAATCTCTACAAAAGCTGCTAGAAGAACATAAAAGTTTGCAAAGTG 

1441 + + + + + + 1500 

ACCTACTACTACATTTTAGAGATGTTTTCGACGATCTTCTTGTATTTTCAAACGTTTCAC 



DDDVKSLQKLLEEHKSLQSD-
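
A deduced translation such as the one shown in this figure can be checked against the nucleotide sequence with a few lines of code. The sketch below is illustrative only: it uses the standard genetic code, reads from a caller-supplied start offset, stops at the first stop codon, and writes "X" for any codon containing an ambiguous base. The function name and the `start` parameter are hypothetical; the example input is the first 30 nucleotides of the coding sequence as printed in the listing that follows.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, start=0):
    """Translate DNA from `start` until the first stop codon or end of sequence."""
    dna = "".join(dna.split()).upper()  # drop the spacing used in printed listings
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")  # X marks unreadable codons
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First 30 coding nucleotides, starting at the ATG, as printed in the listing.
print(translate("ATGAGCGGCC TGGCAGCCAC CACGTTTCAT"))
```

Comparing the output with the printed amino acid line is a quick way to catch scanning errors in either the nucleotide or the protein rows.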
Sequence Range: 1 to 6059

10 20 30 40 

ACTAGTCAAG ATGAGCGGCC TGGCAGCCAC CACGTTTCAT 
MSG L A AT TFH 

90 100 110 120 

TCCAGGCTTG CAAGCGATTA CCAGATGAAC ACAATGATGT 
LQAC KRL PDE HNDV 

170 180 190 200 

TCCAAGAGTG GGAAACCACC CATCAGTGAT ATGTTCTCAG 
SKS GK"PP ISD MFS 

250 260 270 280 

CCTCACAGGA ACATCATTGC CAAAGGAACG TGGTTCCACA 
LTG TSL PKER GST 

330 340 350 360 

TTTTACATCA GAACAATGTG GACTTGGTGA ATATTGGAGG 
VLHQ NNV DLV NIGG 

410 420 430 440 

TTACTCTGGA GCATCATTCT GCACTGGCAG GTGAAGGATG 
LLW SIIL HWQ VKD 

490 500 510 520 

CGAGAAGATC CTGCTGAGCT GGGTGCGGCA GACCACCAGG 
EKI L h S WVRQ TTR 

570 580 590 600 

GGACCGATGG ACTCGCGTTC AACGCCGTGC TCCACCGGCA 
WTDG LAP NAV LHRH 

650 660 670 680 

TCCCCAATTG AGAGACTTGA CCATGCTTTT GACAAGGCCC 
SPI ERLD HAF DKA 

730 740 750 760 

TGTTGCTGTG CATCTCCCTG ACAAGAAATC CATAATTATG 
VAV HLP DKKS IIM 

810 820 830 840 

CGATAGATGC CATCCGAGAG GTGGAGACTC TCCCAAGGAA 
TIDA IRE VET LPRK 

890 900 910 920 

AGTGCAGTGC TGGCAGAGGA AGGCCAGAGT CCCCGAGCTG 
SAV LAEE GQS PRA 

970 980 990 1000 

CAGCTACCAG ATAGCGCTAG AGGAAGTGCT GACGTGGCTG 
SYQ lAU EEVL TWL 

1050 1060 1070 1080 

CTGATGATGT CGAAGAAGTC AAAGAGCAGT TTGCTACCCA 
SDDV EEV KEQ PATH 

1130 1140 1150 1160 

GTGGGGAGCG TCCTGCAGGC TGGCAACCAG CTGATGACAC 
VGS VLQA GNQ LMT 

1210 1220 1230 1240 

ACAGATGACC TTGCTGAATG CAAGGTGGGA GGCGCTCCGG 
OMT LLN ARWE ALR 

1290 1300 1310 1320 

TGATGGAGCT GCAGAAGAAA CAGCTGCAGC AGCTCTCAAG 
50 60 70 80

TGGAAAAAGT GCAGATTGGA TTTGCCAGGG CATGTAGCTC 
WKK CRLD L. P G HVA> 

130 140 ISO 160 

ACAGAAGAAA ACCTTTACCA AATGGATAAA CGCTCGATTT 
QKK TFT KWIN ARF> 

210 220 230 240 

ACCTCAAAGA TGGGAGAAAG CTCTTGGATC TTCTCGAAGG 
DLKD GRK LLD LLEG> 

290 300 ,, 310 320 

AGGGTGCATG CCTTAAACAA TGTCAACCGA GTGCTACAGG 
RVH ALNN VNR VLQ> 

370 380 390 400 

CACGGACATT GTGGCTGGAA ATCCCAAGCT GACTTTAGGG 
TDI VAG NPKL TL,G> 

450 460 470 480 

TCATGAAAGA TATCATGTCA GACCTGCAGC AGACAAACAG 
VMKD IMS DLQ QTNS> 

530 540 550 560 

CCCTACAGTC AAGTCAACGT CCTCAACTTC ACCACCAGCT 
PYS QVNV LNF TTS> 

610 620 630 640 

CAAACCAGAT CTCTTCGACT GGGACGAGAT GGTCAAAATG 
KPD LFD WDEM VKM> 

690 700 710 720 

ACACTTCTTT GGGAATTGAA AAGCTCCTAA GTCCTGAAAC 
HTSL GIE KLL SPET> 

770 780 790 800 

TATTTAACGT CTCTGTTTGA GGTGCTTCCT CAGCAAGTCA 
YLT SLFE VLP QQV> 

850 860 870 880 

GTATAAGAAA GAATGTGAAG AGGAAGAAAT TCATATCCAG 
YKK ECE EEEI HIQ^ 

930 940 950 960 

AGACCCCTAG CACCGTCACT GAAGTGGACA TGGATTTGGA 
ETPS TVT EVD MDLD> 

1010 1020 1030 1040 

CTGTCCGCGG AGGACACGTT CCAGGAGCAA CATGACATTT 
LSA ED TF QEQ HDI> 

1090 1100 1110 1120 

TGAAACTTTT ATGATGGAGC TGACAGCACA CCAGAGCAGC 
ETF MME LTAH QSS> 

1170 1180 1190 1200 

AAGGGACTCT GTCCAGAGAG GAGGAGTTTG AGATCCAGGA 
QGTL SRE EEF EIQE> 

1250 1260 1270 1280 

GTGGAGAGCA TGGAGAGGCA GTCCCGGCTG CACGACGCTC 

VES MERQ SRL HDA> 

1330 1340 1350 1360 

CTGGCTGGCC CTCACAGAAG AGCGCATTCA GAAGATGGAG 
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M 



K 



W 



K M 



1370 1380 1390 1400 1410 1420 1430 1440 

AGCCTCCCGC TGGGTGATGA CCTGCCCTCC CTGCAGAAGC TGCTTCAAGA ACATAAAAGT TTGCAAAATG ACCTTGAAGC 
SLP LGDD LPS LQK LLQE HKS LQN DLEA> 

1450 1460 1470 1480 1490 1500 1510 1520 

TGAACAGGTG AAGGTAAATT CCTTAACTCA CATGGTGGTG ATTGTGGATG AAAACAGTGG GGAGAGTGCC ACAGCTCTTC 
EQV KVN SLTH MVV IVD ENSG ESA TAL> 

1530 1540 1550 1560 1570 1580 1590 1600

TGGAAGATCA GTTACAGAAA CTGGGTGAGC GCTGGACAGC TGTATGCCGC TGGACTGAAG AACGTTGGAA CAGGTTGCAA 
LEDQ LQK LGE RWTA VCR WTE ERWN RLQ> 

1610 1620 1630 1640 1650 1660 . 1670 1680 

GAAATCAGTA TTCTGTGGCA GGAATTATTG GAAGAGCAGT GTCTGTTGGA GGCTTGGCTC ACCGAAAAGG aagaggcttt 
EIS ILWQ ELL EEQ CLLE AWL TEK EEAL> 

1690 1700 1710 1720 1730 1740 1750 1760 

ggataaagtt caaaccagca actttaaaga ccagaaggaa ctaagtgtca gtgtccggcg tctggctata ttgaaggaag 
dkv qts nfkd qke lsv svrr lai lke> 

1770 1780 1790 1800 1810 1820 1830 1840

acatggaaat gaagaggcag actctggatc aactgagtga gattggccag gatgtgggcc aattactcag taatcccaag 
dmem krq tld qlse igq dvg qlls npk> 

1850 1860 1870 1880 1890 1900 1910 1920 

GCATCTAAGA AGATGAACAG TGACTCTGAG GAGCTAACAC AGAGATGGGA TTCTCTGGTT CAGAGACTCG AAGACTCTTC 
ASK KMNS DSE ELT QRWD SLV QRL EDSS> 

1930 1940 1950 1960 1970 1980 1990 2000 

TAACCAGGTG ACTCAGGCGG TAGCGAAGCT CGGCATGTCC CAGATTCCAC AGAAGGACCT ATTGGAGACC GTTCATGTGA 
NQV TQA VAKL GMS QIP QKDL LET VHV>

2010 2020 2030 2040 2050 2060 2070 2080 

GAGAACAAGG GATGGTGAAG AAGCCCAAGC AGGAACTGCC TCCTCCGTTA ACAAAGGCTG AGCATGCTAT GCAAAAGAGA 
REQG MVK KPK QELP PPL TKA EHAM QKR> 

2090 2100 2110 2120 2130 2140 2150 2160 

TCAACCACCG AATTGGGAGA AAACCTGCAA GAATTAAGAG ACTTAACTCA AGAAATGGAA GTACATGCTG AAAAACTCAA 
STT ELGE NLQ ELR DLTQ EME VHA EKLK> 

2170 2180 2190 2200 2210 2220 2230 2240 

ATGGCTGAAT AGAACTGAAT TGGAGATGCT TTCAGATAAA AGTCTGAGTT TACCTGAAAG GGATAAAATT TCAGAAAGCT 
WLN RTE LEML SDK SLS LPER DKI SES>

2250 2260 2270 2280 2290 2300 2310 2320 

TAAGGACTGT AAATATGACA TGGAATAAGA TTTGCAGAGA GGTGCCTACC ACCCTGAAGG AATGCATCCA GGAGCCCAGT 
LRTV NMT WNK ICRE VPT TLK ECIQ EPS>

2330 2340 2350 2360 2370 2380 2390 2400 

TCTGTTTCAC AGACAAGGAT TGCTGCTCAT CCTAATGTCC AAAAGGTGGT GCTAGTATCA TCTGCGTCAG ATATTCCTGT 
SVS QTRI AAH PNV QKVV LVS SAS DIPV> 



2410 2420 2430 2440 2450 2460 2470 2480

TCAGTCTCAT CGTACTTCGG AAATTTCAAT TCCTGCTGAT CTTGATAAAA CTATAACAGA ACTAGCCGAC TGGCTGGTAT 

QSH RTS EISI PAD LDK TITE LAD WLV> 

2490 2500 2510 2520 2530 2540 2550 2560 

TAATCGACCA GATGCTGAAG TCCAACATTG TCACTGTTGG GGATGTAGAA GAGATCAATA AGACCGTTTC CCGAATGAAA 

LIDQ MLK SNI VTVG DVE EINK TVS RMK>

2570 2580 2590 2600 2610 2620 2630 2640 

ATTACAAAGG CTGACTTAGA ACAGCGCCAT CCTCAGCTGG ATTATGTTTT TACATTGGCA CAGAATTTGA AAAATAAAGC 

ITK ADLE QRH PQL DYVF TLA QNL KNKA> 

2650 2660 2670 2680 2690 2700 2710 2720 

TTCCAGTTCA GATATGAGAA CAGCAATTAC AGAAAAATTG GAAAGGGTCA AGAACCAGTG GGATGGCACC CAGCATGGCG 
SSS DMR TAIT EKL ERV KNQW DGT QHG>

2730 2740 2750 2760 2770 2780 2790 2800 

TTGAGCTAAG ACAGCAGCAG CTTGAGGACA TGATTATTGA CAGTCTTCAG TGGGATGACC ATAGGGAGGA GACTGAAGAA 

VELR QQQ LED MIID SLQ WDD HREE TEE>

2810 2820 2830 2840 2850 2860 2870 2880

CTGATGAGAA AATATGAGGC TCGACTCTAT ATTCTTCAGC AAGCCCGACG GGATCCACTC ACCAAACAAA TTTCTGATAA 

LMR KYEA RLY ILQ QARR DPL TKQ ISDN>

2890 2900 2910 2920 2930 2940 2950 2960 

CCAAATACTG CTTCAAGAAC TGGGTCCTGG AGATGGTATC GTCATGGCGT TCGATAACGT CCTGCAGAAA CTCCTGGAGG 

QIL LQE LGPG DGI VMA FDNV LQK LLE> 



2970 2980 2990 3000 3010 3020 3030 3040

AATATGGGAG TGATGACACA AGGAATGTGA AAGAAACCAC AGAGTACTTA AAAACATCAT GGATCAATCT CAAACAAAGT
EYGS DDT RNV KETT EYL KTS WINL KQS>



3050 3060 3070 3080 3090 3100 3110 3120 

ATTGCTGACA GACAGAACGC CTTGGAGGCT GAGTGGAGGA CGGTGCAGGC CTCTCGCAGA GATCTGGAAA ACTTCCTGAA 
IAD RQNA LEA EWR TVQA SRR DLE NFLK>

3130 3140 3150 3160 3170 3180 3190 3200 

GTGGATCCAA GAAGCAGAGA CCACAGTGAA TGTGCTTGTG GATGCCTCTC ATCGGGAGAA TGCTCTTCAG GATAGTATCT 
WIQ EAE TTVN VLV DAS HREN ALQ DSI>

3210 3220 3230 3240 3250 3260 3270 3280 

TGGCCAGGGA ACTCAAACAG CAGATGCAGG ACATCCAGGC AGAAATTGAT GCCCACAATG ACATATTTAA AAGCATTGAC 
LARE LKQ QMQ DIQA EID AHN DIFK SID> 

3290 3300 3310 3320 3330 3340 3350 3360 

GGAAACAGGC AGAAGATGGT AAAAGCTTTG GGAAATTCTG AAGAGGCTAC TATGCTTCAA CATCGACTGG ATGATATGAA 
GNR QKMV KAL GNS EEAT MLQ HRL DDMN>

3370 3380 3390 3400 3410 3420 3430 3440 

CCAAAGATGG AATGACTTAA AAGCAAAATC TGCTAGCATC AGGGCCCATT TGGAGGCCAG CGCTGAGAAG TGGAACAGGT 
QRW NDL KAKS ASI RAH LEAS AEK WNR> 

3450 3460 3470 3480 3490 3500 3510 3520 

TGCTGATGTC CTTAGAAGAA CTGATCAAAT GGCTGAATAT GAAAGATGAA GAGCTTAAGA AACAAATGCC TATTGGAGGA 
LLMS LEE LIK WLNM KDE ELK KQMP IGG>

3530 3540 3550 3560 3570 3580 3590 3600 

GATCTTCCAG CCTTACAGCT CCAGTATGAC CATTGTAAGG CCCTGAGACG GGAGTTAAAG GAGAAAGAAT ATTCTGTCCT 
DVP ALQL QYD HCK ALRR ELK EKE YSVL>

3610 3620 3630 3640 3650 3660 3670 3680

GAATCCTGTC GACCAGGCCC GAGTTTTCTT GGCTGATCAG CCAATTGAGG CCCCTGAAGA GCCAAGAAGA AACCTACAAT 
NAV DQA RVFL ADQ PIE APEE PRR NLQ>

3690 3700 3710 3720 3730 3740 3750 3760 

CAAAAACAGA ATTAACTCCT GAGGAGAGAG CCCAAAAGAT TGCCAAAGCC ATGCGCAAAC AGTCTTCTGA AGTCAAAGAA 
SKTE LTP EER AQKI AKA MRK QSSE VKE> 
AAATGGGAAA GTCTAAATGC TGTAACTAGC AATTGGCAAA AGCAAGTGGA CAAGGCATTG GAGAAACTCA GAGACCTGCA
GGGAGCTATG GATGACCTGG ACGCTGACAT GAAGGAGGCA GAGTCCGTGC GGAATGGCTG GAAGCCCGTG GGAGACTTAC
GAM DDL DADM KEA ESV RNGW KPV GDL>
TCATTGACTC GCTGCAGGAT CACATTGAAA AAATCATGGC ATTTAGAGAA GAAATTGCAC CAATCAACTT TAAAGTTAAA
LIDS LQD HIE KIMA FRE EIA PINF KVK>
SEQUENCE LISTING



<110> Burton, Edward 

Tinsley, Jonathan 
Davies, Kay 

<120> Utrophin Gene Promoter 

<130> P02428US0 

<140> TBA 

<141> 04/04/2002 

<150> PCT/GB00/03800
<151> 10/04/2000 

<150> GB 9923423.9 
<151> 10/04/1999 

<160> 30 

<170> PatentIn Ver. 2.1

<210> 1 

<211> 1197 

<212> DNA 

<213> Homo sapiens 



<400> 1 

tttctatttc acaacaagca agaaaaagaa tgagagaagg actagaaagt agatgtgatc 60

atatgaataa tgattttcct tgctttttgc atgtatgtgg tggacacatg cagaagtgac 120

agcaggagtt cgagaccagc ctgaccaaca tggtgaaatc ccgtctctac taaacacaca 180

cacacacaca cacacacaca cacacacaca cacacacaca atagccgggc atggtggtgg 240

gcacctgtaa tcccagctac ttgggaggct gaggcacaag aatgacttga acccaggagg 300

cggaggttgc agtgagctga gatcatgcca ttgcactcca gcctgggtga cgagtgaaaa 360

aaaaataatg ataataaaga gagcaaggtg accacaaaag agaataggct ggaaaaattt 420

gtctaaatgg tggcctcttc tcttatagct gcatatggtt aagtttattt tttccctagt 480

agcgaattct aagggatgaa gaagaaatcc ttttcagttt tacttcccca aggtgtgtat 540

aactactata gtgaaataat aagtccaatt tattctttga agtatagtta atatgtaacg 600

aaactcctaa ggccagttgt atacccaggg caaacgcctt ctaacatctt tatttatcta 660

cgcagtgggt agggaggtgg gtggagtgcc ccttcccagc tgatactgtc aaaacaggaa 720

gcaaagttat aatctctgtc ataggaacat gaatagaggc ccttagttgt gactattaaa 780

aaaacaaaaa acctgcctaa ggagttttca ctgactacaa agtgtaactt cctctctggt 840

gtttagagga ggtggggtta ggtttagtca gatcctctca tgggaaaaat aaaagccacc 900

aaaaaaaaaa aaaaaaaaaa cccaaaataa cacaggacat cccagtgtgc agttcgaagg 960

ctgcttttgt tgtccacttc ctccacatct ttttcctcat catctaagca gatgtaggtg 1020

atgagcggcc tggcagccac cacgtttcat tggaaaaagt gcagattgga tttgccaggg 1080

catgtagctc tccaggcttg caagcgatta ccaggtaagt ttgtcaactt gcacgactcc 1140

cagccagtga ggttttctta agaaacgtct atgaagacag ggttctttca ttcagtt 1197



<210> 2 
<211> 32 
<212> PRT 

<213> Homo sapiens 
<400> 2 

Met Ser Gly Leu Ala Ala Thr Thr Phe His Trp Lys Lys Cys Arg Leu
1 5 10 15

Asp Leu Pro Gly His Val Ala Leu Gln Ala Cys Lys Arg Leu Pro Asp
20 25 30
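
The 32-residue peptide of SEQ ID NO: 2 can be cross-checked against the nucleotide listing. The following is a minimal Python sketch, not part of the filed listing, that translates nucleotides 81-176 of SEQ ID NO: 5 as printed there, using the standard genetic code. The names `CODON_TABLE`, `translate` and `ORF` are hypothetical and introduced only for illustration.

```python
# Standard genetic code, with codons ordered t, c, a, g at each position.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces ignored) in frame 1 until a stop codon."""
    dna = dna.replace(" ", "").lower()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Nucleotides 81-176 of SEQ ID NO: 5, copied from the listing (assumed frame 1).
ORF = ("atgagcggcc tggcagccac cacgtttcat tggaaaaagt gcagattgga "
       "tttgccaggg catgtagctc tccaggcttg caagcgatta ccagat")

print(translate(ORF))  # MSGLAATTFHWKKCRLDLPGHVALQACKRLPD
```

The one-letter output corresponds residue for residue to the three-letter sequence given for SEQ ID NO: 2.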






<210> 3 
<211> 1145 
<212> DNA 
<213> Mus sp. 



<220> 

<221> misc_feature
<222> (120) 

<223> n = a or g or c or t 



<220> 

<221> misc_feature
<222> (568) 

<223> n = a or g or c or t 



<400> 3 

tactacgtgg gttatagcag taaactgggt tttgactaag tgacatgact ggagccattc 60 

tgattcttta ctgtctcacc ccatcttatt ccgttggagg atgaggatca gaggacagan 120 

tgcttagttig ttttttccag agtctcaagt ctatggtctt ctgagctaca tagataggtt 180 

ccttttactt ggaactcctg tggaccctgg tagggttaca tattctgtga gaatctttgt 240 

gctaggtacg gattctgttt cagaggagga aagaaagcta ttagatccat actaaggatg 300 

caggcatggc agtacaaaca cctttccttc tcttttgcac gtgtgtggag aacacatatg 360 

caaatgatgt caagagaaca aaacaaccat ctaaaacaga agtctggaaa atatgagtct 420 

gtgtggttat tgtttttttc caccgtagca gtttctttct cttttccttt gtggtttttg 480 

gagacagggt ttctctatgt agccctggct gtcttggagc ttacactgta gaccaggctg 540 

gccttgaact cacagagatc cacctgcntc tgcctcctgt gtgggagtaa aggcgtgtac 600 

caccaccaaa gtaaacactg ttgtgagtat gcatagtggg gtgtgtgtgt gtgtgtgtgc 660 

tgtcagacac catcaaacaa gaaaagttag catctctcta gttgctttgg aacattcaaa 720 

agctctaagc tgtgactatt aaaaaccaaa agtacctcaa gagttcttaa ctgactgcgg 780 

agtttaactt cctgtctgag gggaggtgga gttagattta gtcagatcct ctcgtgggaa 840 

aaaatcaaag ggactttaaa aaagaaaaaa acaaaaccca acctaacagg acatcccagt 900 

gtgcagttcg cgggcggctt ttgtgttgat ttccttcaca gtttccctca tctcagccac 960 

tgtaggtgat gagcagcctg gcagccacca catttcgttg gaaaaagtgg aggttggatc 1020 

tgcctgggca ggtgcctctc caggcttgca ggagatcccc cggtaagttt gtcagtggcc 1080 

agactgcagt tgctaaggga ggctttggac agagggtgtt cgagttggca gagcctcact 1140 
ttctc 1145 



<210> 4 
<211> 32 
<212> PRT 
<213> Mus sp. 

<400> 4 

Met Ser Ser Leu Ala Ala Thr Thr Phe Arg Trp Lys Lys Trp Arg Leu
1 5 10 15

Asp Leu Pro Gly Gln Val Pro Leu Gln Ala Cys Arg Arg Ser Pro Asp
20 25 30



<210> 5 

<211> 1500 

<212> DNA 

<213> Homo sapiens 



<400> 5 

cccagtgtgc agttcgaagg ctgcttttgt tgtccacttc ctccacatct ttttcctcat 60 

catctaagca gatgtaggtg atgagcggcc tggcagccac cacgtttcat tggaaaaagt 120 

gcagattgga tttgccaggg catgtagctc tccaggcttg caagcgatta ccagatgaac 180 



acaatgacgt acagaagaaa acctttacca aatggataaa tgctcgattt tcaaagagtg 240

ggaaaccacc catcaatgat atgttcacag acctcaaaga tggaaggaag ctattggatc 300

ttctagaagg cctcacagga acatcactgc caaaggaacg tggttccaca agggtacatg 360

ccttaaataa cgtcaacaga gtgctgcagg ttttacatca gaacaatgtg gaattagtga 420

atataggggg aactgacatt gtggatggaa atcacaaact gactttgggg ttactttgga 480

gcatcatttt gcactggcag gtgaaagatg tcatgaagga tgtcatgtcg gacctgcagc 540

agacgaacag tgagaagatc ctgctcagct gggtgcgtca gaccaccagg ccctacagcc 600

aagtcaacgt cctcaacttc accaccagct ggacagatgg actcgccttt aatgctgtcc 660

tccaccgaca taaacctgat ctcttcagct gggataaagt tgtcaaaatg tcaccaattg 720

agagacttga acatgccttc agcaaggctc aaacttattt gggaattgaa aagctgttag 780

atcctgaaga tgttgccgtt cggcttcctg acaagaaatc cataattatg tatttaacat 840

ctttgtttga ggtgctacct cagcaagtca ccatagacgc catccgtgag gtagagacac 900

tcccaaggaa atataaaaaa gaatgtgaag aagaggcaat taatatacag agtacagcgc 960

ctgaggagga gcatgagagt ccccgagctg aaactcccag cactgtcact gaggtcgaca 1020

tggatctgga cagctatcag attgcgttgg aggaagtgct gacctggttg ctttctgctg 1080

aggacacttt ccaggagcag gatgatattt ctgatgatgt tgaagaagtc aaagaccagt 1140

ttgcaaccca tgaagctttt atgatggaac tgactgcaca ccagagcagt gtgggcagcg 1200

tcctgcaggc aggcaaccaa ctgataacac aaggaactct gtcagacgaa gaagaatttg 1260

agattcagga acagatgacc ctgctgaatg ctagatggga ggctcttagg gtggagagta 1320

tggacagaca gtcccggctg cacgatgtgc tgatggaact gcagaagaag caactgcagc 1380

agctctccgc ctggttaaca ctcacagagg agcgcattca gaagatggaa acttgccccc 1440

tggatgatga tgtaaaatct ctacaaaagc tgctagaaga acataaaagt ttgcaaagtg 1500



<210> 6 

<211> 1500 

<212> DNA 

<213> Homo sapiens 

<400> 6 

cactttgcaa acttttatgt tcttctagca gcttttgtag agattttaca tcatcatcca 60

gggggcaagt ttccatcttc tgaatgcgct cctctgtgag tgttaaccag gcggagagct 120

gctgcagttg cttcttctgc agttccatca gcacatcgtg cagccgggac tgtctgtcca 180

tactctccac cctaagagcc tcccatctag cattcagcag ggtcatctgt tcctgaatct 240

caaattcttc ttcgtctgac agagttcctt gtgttatcag ttggttgcct gcctgcagga 300

cgctgcccac actgctctgg tgtgcagtca gttccatcat aaaagcttca tgggttgcaa 360

actggtcttt gacttcttca acatcatcag aaatatcatc ctgctcctgg aaagtgtcct 420

cagcagaaag caaccaggtc agcacttcct ccaacgcaat ctgatagctg tccagatcca 480

tgtcgacctc agtgacagtg ctgggagttt cagctcgggg actctcatgc tcctcctcag 540

gcgctgtact ctgtatatta attgcctctt cttcacattc ttttttatat ttccttggga 600

gtgtctctac ctcacggatg gcgtctatgg tgacttgctg aggtagcacc tcaaacaaag 660

atgttaaata cataattatg gatttcttgt caggaagccg aacggcaaca tcttcaggat 720

ctaacagctt ttcaattccc aaataagttt gagccttgct gaaggcatgt tcaagtctct 780

caattggtga cattttgaca actttatccc agctgaagag atcaggttta tgtcggtgga 840

ggacagcatt aaaggcgagt ccatctgtcc agctggtggt gaagttgagg acgttgactt 900

ggctgtaggg cctggtggtc tgacgcaccc agctgagcag gatcttctca ctgttcgtct 960

gctgcaggtc cgacatgaca tccttcatga catctttcac ctgccagtgc aaaatgatgc 1020

tccaaagtaa ccccaaagtc agtttgtgat ttccatccac aatgtcagtt ccccctatat 1080

tcactaattc cacattgttc tgatgtaaaa cctgcagcac tctgttgacg ttatttaagg 1140

catgtaccct tgtggaacca cgttcctttg gcagtgatgt tcctgtgagg ccttctagaa 1200

gatccaatag cttccttcca tctttgaggt ctgtgaacat atcattgatg ggtggtttcc 1260

cactctttga aaatcgagca tttatccatt tggtaaaggt tttcttctgt acgtcattgt 1320

gttcatctgg taatcgcttg caagcctgga gagctacatg ccctggcaaa tccaatctgc 1380

actttttcca atgaaacgtg gtggctgcca ggccgctcat cacctacatc tgcttagatg 1440

atgaggaaaa agatgtggag gaagtggaca acaaaagcag ccttcgaact gcacactggg 1500
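
As printed, the last row of SEQ ID NO: 6 reads as the reverse complement of the first row of SEQ ID NO: 5. The following minimal Python sketch, added for illustration only, shows how that relationship can be checked. The helper name `reverse_complement` and the two constants are hypothetical, and only the two rows quoted here are compared.

```python
# Map each base to its Watson-Crick partner (lower-case DNA only).
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string, ignoring spaces."""
    dna = dna.replace(" ", "").lower()
    return dna.translate(COMPLEMENT)[::-1]


# First row (nucleotides 1-60) of SEQ ID NO: 5, copied from the listing.
SEQ5_FIRST_ROW = ("cccagtgtgc agttcgaagg ctgcttttgt "
                  "tgtccacttc ctccacatct ttttcctcat")
# Last row (nucleotides 1441-1500) of SEQ ID NO: 6, copied from the listing.
SEQ6_LAST_ROW = ("atgaggaaaa agatgtggag gaagtggaca "
                 "acaaaagcag ccttcgaact gcacactggg")

print(reverse_complement(SEQ5_FIRST_ROW) == SEQ6_LAST_ROW.replace(" ", ""))
```

The same helper could be applied row by row to compare the two 1500-nucleotide listings in full.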
<210> 7
<211> 474
<212> PRT

<213> Homo sapiens

<400> 7

Met Ser Gly Leu Ala Ala Thr Thr Phe His Trp Lys Lys Cys Arg Leu
1 5 10 15

Asp Leu Pro Gly His Val Ala Leu Gln Ala Cys Lys Arg Leu Pro Asp
20 25 30

Glu His Asn Asp Val Gln Lys Lys Thr Phe Thr Lys Trp Ile Asn Ala
35 40 45

Arg Phe Ser Lys Ser Gly Lys Pro Pro Ile Asn Asp Met Phe Thr Asp
50 55 60

Leu Lys Asp Gly Arg Lys Leu Leu Asp Leu Leu Glu Gly Leu Thr Gly
65 70 75 80

Thr Ser Leu Pro Lys Glu Arg Gly Ser Thr Arg Val His Ala Leu Asn
85 90 95

Asn Val Asn Arg Val Leu Gln Val Leu His Gln Asn Asn Val Glu Leu
100 105 110

Val Asn Ile Gly Gly Thr Asp Ile Val Asp Gly Asn His Lys Leu Thr
115 120 125

Leu Gly Leu Leu Trp Ser Ile Ile Leu His Trp Gln Val Lys Asp Val
130 135 140

Met Lys Asp Val Met Ser Asp Leu Gln Gln Thr Asn Ser Glu Lys Ile
145 150 155 160

Leu Leu Ser Trp Val Arg Gln Thr Thr Arg Pro Tyr Ser Gln Val Asn
165 170 175

Val Leu Asn Phe Thr Thr Ser Trp Thr Asp Gly Leu Ala Phe Asn Ala
180 185 190

Val Leu His Arg His Lys Pro Asp Leu Phe Ser Trp Asp Lys Val Val
195 200 205

Lys Met Ser Pro Ile Glu Arg Leu Glu His Ala Phe Ser Lys Ala Gln
210 215 220

Thr Tyr Leu Gly Ile Glu Lys Leu Leu Asp Pro Glu Asp Val Ala Val
225 230 235 240

Arg Leu Pro Asp Lys Lys Ser Ile Ile Met Tyr Leu Thr Ser Leu Phe
245 250 255

Glu Val Leu Pro Gln Gln Val Thr Ile Asp Ala Ile Arg Glu Val Glu
260 265 270

Thr Leu Pro Arg Lys Tyr Lys Lys Glu Cys Glu Glu Glu Ala Ile Asn
275 280 285

Ile Gln Ser Thr Ala Pro Glu Glu Glu His Glu Ser Pro Arg Ala Glu
290 295 300

Thr Pro Ser Thr Val Thr Glu Val Asp Met Asp Leu Asp Ser Tyr Gln
305 310 315 320

Ile Ala Leu Glu Glu Val Leu Thr Trp Leu Leu Ser Ala Glu Asp Thr
325 330 335

Phe Gln Glu Gln Asp Asp Ile Ser Asp Asp Val Glu Glu Val Lys Asp
340 345 350

Gln Phe Ala Thr His Glu Ala Phe Met Met Glu Leu Thr Ala His Gln
355 360 365

Ser Ser Val Gly Ser Val Leu Gln Ala Gly Asn Gln Leu Ile Thr Gln
370 375 380

Gly Thr Leu Ser Asp Glu Glu Glu Phe Glu Ile Gln Glu Gln Met Thr
385 390 395 400

Leu Leu Asn Ala Arg Trp Glu Ala Leu Arg Val Glu Ser Met Asp Arg
405 410 415

Gln Ser Arg Leu His Asp Val Leu Met Glu Leu Gln Lys Lys Gln Leu
420 425 430

Gln Gln Leu Ser Ala Trp Leu Thr Leu Thr Glu Glu Arg Ile Gln Lys
435 440 445

Met Glu Thr Cys Pro Leu Asp Asp Asp Val Lys Ser Leu Gln Lys Leu
450 455 460

Leu Glu Glu His Lys Ser Leu Gln Ser Asp
465 470



<210> 8 
<211> 6059 
<212> DNA 

<213> Artificial Sequence 

<220> 

<221> CDS 

<222> (11)..(6052)

<220> 

<223> Description of Artificial Sequence: Utrophin B 
isoform "minigene" 

<400> 8 

actagtcaag atg agc ggc ctg gca gcc acc acg ttt cat tgg aaa aag 49
Met Ser Gly Leu Ala Ala Thr Thr Phe His Trp Lys Lys
1 5 10

tgc aga ttg gat ttg cca ggg cat gta gct ctc cag gct tgc aag cga 97
Cys Arg Leu Asp Leu Pro Gly His Val Ala Leu Gln Ala Cys Lys Arg
15 20 25

tta cca gat gaa cac aat gat gta cag aag aaa acc ttt acc aaa tgg 145
Leu Pro Asp Glu His Asn Asp Val Gln Lys Lys Thr Phe Thr Lys Trp
30 35 40 45

ata aac gct cga ttt tcc aag agt ggg aaa cca ccc atc agt gat atg 193
Ile Asn Ala Arg Phe Ser Lys Ser Gly Lys Pro Pro Ile Ser Asp Met
50 55 60

ttc tca gac ctc aaa gat ggg aga aag ctc ttg gat ctt ctc gaa ggc 241
Phe Ser Asp Leu Lys Asp Gly Arg Lys Leu Leu Asp Leu Leu Glu Gly
65 70 75

ctc aca gga aca tca ttg cca aag gaa cgt ggt tcc aca agg gtg cat 289
Leu Thr Gly Thr Ser Leu Pro Lys Glu Arg Gly Ser Thr Arg Val His
80 85 90

gcc tta aac aat gtc aac cga gtg cta cag gtt tta cat cag aac aat 337
Ala Leu Asn Asn Val Asn Arg Val Leu Gln Val Leu His Gln Asn Asn
95 100 105

gtg gac ttg gtg aat att gga ggc acg gac att gtg gct gga aat ccc 385
Val Asp Leu Val Asn Ile Gly Gly Thr Asp Ile Val Ala Gly Asn Pro
110 115 120 125

aag ctg act tta ggg tta ctc tgg agc atc att ctg cac tgg cag gtg 433
Lys Leu Thr Leu Gly Leu Leu Trp Ser Ile Ile Leu His Trp Gln Val
130 135 140

aag gat gtc atg aaa gat atc atg tca gac ctg cag cag aca aac agc 481
Lys Asp Val Met Lys Asp Ile Met Ser Asp Leu Gln Gln Thr Asn Ser
145 150 155

gag aag atc ctg ctg agc tgg gtg cgg cag acc acc agg ccc tac agt 529
Glu Lys Ile Leu Leu Ser Trp Val Arg Gln Thr Thr Arg Pro Tyr Ser
160 165 170

caa gtc aac gtc ctc aac ttc acc acc agc tgg acc gat gga ctc gcg 577
Gln Val Asn Val Leu Asn Phe Thr Thr Ser Trp Thr Asp Gly Leu Ala
175 180 185

ttc aac gcc gtg ctc cac cgg cac aaa cca gat ctc ttc gac tgg gac 625
Phe Asn Ala Val Leu His Arg His Lys Pro Asp Leu Phe Asp Trp Asp
190 195 200 205

gag atg gtc aaa atg tcc cca att gag aga ctt gac cat gct ttt gac 673
Glu Met Val Lys Met Ser Pro Ile Glu Arg Leu Asp His Ala Phe Asp
210 215 220

aag gcc cac act tct ttg gga att gaa aag ctc cta agt cct gaa act 721
Lys Ala His Thr Ser Leu Gly Ile Glu Lys Leu Leu Ser Pro Glu Thr
225 230 235

gtt gct gtg cat ctc cct gac aag aaa tcc ata att atg tat tta acg 769
Val Ala Val His Leu Pro Asp Lys Lys Ser Ile Ile Met Tyr Leu Thr
240 245 250

tct ctg ttt gag gtg ctt cct cag caa gtc acg ata gat gcc atc cga 817
Ser Leu Phe Glu Val Leu Pro Gln Gln Val Thr Ile Asp Ala Ile Arg
255 260 265

gag gtg gag act ctc cca agg aag tat aag aaa gaa tgt gaa gag gaa 865
Glu Val Glu Thr Leu Pro Arg Lys Tyr Lys Lys Glu Cys Glu Glu Glu
270 275 280 285
gaa att cat atc cag agt gca gtg ctg gca gag gaa ggc cag agt ccc 913
Glu Ile His Ile Gln Ser Ala Val Leu Ala Glu Glu Gly Gln Ser Pro
290 295 300

cga gct gag acc cct agc acc gtc act gaa gtg gac atg gat ttg gac 961
Arg Ala Glu Thr Pro Ser Thr Val Thr Glu Val Asp Met Asp Leu Asp
305 310 315

agc tac cag ata gcg cta gag gaa gtg ctg acg tgg ctg ctg tcc gcg 1009
Ser Tyr Gln Ile Ala Leu Glu Glu Val Leu Thr Trp Leu Leu Ser Ala
320 325 330

gag gac acg ttc cag gag caa cat gac att tct gat gat gtc gaa gaa 1057
Glu Asp Thr Phe Gln Glu Gln His Asp Ile Ser Asp Asp Val Glu Glu
335 340 345

gtc aaa gag cag ttt gct acc cat gaa act ttt atg atg gag ctg aca 1105
Val Lys Glu Gln Phe Ala Thr His Glu Thr Phe Met Met Glu Leu Thr
350 355 360 365

gca cac cag agc agc gtg ggg agc gtc ctg cag gct ggc aac cag ctg 1153
Ala His Gln Ser Ser Val Gly Ser Val Leu Gln Ala Gly Asn Gln Leu
370 375 380

atg aca caa ggg act ctg tcc aga gag gag gag ttt gag atc cag gaa 1201
Met Thr Gln Gly Thr Leu Ser Arg Glu Glu Glu Phe Glu Ile Gln Glu
385 390 395


cag atg acc ttg ctg aat gca agg tgg gag gcg ctc cgg gtg gag agc 1249
Gln Met Thr Leu Leu Asn Ala Arg Trp Glu Ala Leu Arg Val Glu Ser
400 405 410

atg gag agg cag tcc cgg ctg cac gac gct ctg atg gag ctg cag aag 1297
Met Glu Arg Gln Ser Arg Leu His Asp Ala Leu Met Glu Leu Gln Lys
415 420 425

aaa cag ctg cag cag ctc tca agc tgg ctg gcc ctc aca gaa gag cgc 1345
Lys Gln Leu Gln Gln Leu Ser Ser Trp Leu Ala Leu Thr Glu Glu Arg
430 435 440 445

att cag aag atg gag agc ctc ccg ctg ggt gat gac ctg ccc tcc ctg 1393
Ile Gln Lys Met Glu Ser Leu Pro Leu Gly Asp Asp Leu Pro Ser Leu
450 455 460

cag aag ctg ctt caa gaa cat aaa agt ttg caa aat gac ctt gaa gct 1441
Gln Lys Leu Leu Gln Glu His Lys Ser Leu Gln Asn Asp Leu Glu Ala
465 470 475

gaa cag gtg aag gta aat tcc tta act cac atg gtg gtg att gtg gat 1489
Glu Gln Val Lys Val Asn Ser Leu Thr His Met Val Val Ile Val Asp
480 485 490

gaa aac agt ggg gag agt gcc aca gct ctt ctg gaa gat cag tta cag 1537
Glu Asn Ser Gly Glu Ser Ala Thr Ala Leu Leu Glu Asp Gln Leu Gln
495 500 505

aaa ctg ggt gag cgc tgg aca gct gta tgc cgc tgg act gaa gaa cgt 1585
Lys Leu Gly Glu Arg Trp Thr Ala Val Cys Arg Trp Thr Glu Glu Arg
510 515 520 525

tgg aac agg ttg caa gaa atc agt att ctg tgg cag gaa tta ttg gaa 1633
Trp Asn Arg Leu Gln Glu Ile Ser Ile Leu Trp Gln Glu Leu Leu Glu
530 535 540

gag cag tgt ctg ttg gag gct tgg ctc acc gaa aag gaa gag gct ttg 1681
Glu Gln Cys Leu Leu Glu Ala Trp Leu Thr Glu Lys Glu Glu Ala Leu
545 550 555

gat aaa gtt caa acc agc aac ttt aaa gac cag aag gaa cta agt gtc 1729
Asp Lys Val Gln Thr Ser Asn Phe Lys Asp Gln Lys Glu Leu Ser Val
560 565 570

agt gtc cgg cgt ctg gct ata ttg aag gaa gac atg gaa atg aag agg 1777
Ser Val Arg Arg Leu Ala Ile Leu Lys Glu Asp Met Glu Met Lys Arg
575 580 585

cag act ctg gat caa ctg agt gag att ggc cag gat gtg ggc caa tta 1825
Gln Thr Leu Asp Gln Leu Ser Glu Ile Gly Gln Asp Val Gly Gln Leu
590 595 600 605

ctc agt aat ccc aag gca tct aag aag atg aac agt gac tct gag gag 1873
Leu Ser Asn Pro Lys Ala Ser Lys Lys Met Asn Ser Asp Ser Glu Glu
610 615 620

cta aca cag aga tgg gat tct ctg gtt cag aga ctc gaa gac tct tct 1921
Leu Thr Gln Arg Trp Asp Ser Leu Val Gln Arg Leu Glu Asp Ser Ser
625 630 635

aac cag gtg act cag gcg gta gcg aag ctc ggc atg tcc cag att cca 1969
Asn Gln Val Thr Gln Ala Val Ala Lys Leu Gly Met Ser Gln Ile Pro
640 645 650

cag aag gac cta ttg gag acc gtt cat gtg aga gaa caa ggg atg gtg 2017
Gln Lys Asp Leu Leu Glu Thr Val His Val Arg Glu Gln Gly Met Val
655 660 665

aag aag ccc aag cag gaa ctg cct cct ccg tta aca aag gct gag cat 2065
Lys Lys Pro Lys Gln Glu Leu Pro Pro Pro Leu Thr Lys Ala Glu His
670 675 680 685

gct atg caa aag aga tca acc acc gaa ttg gga gaa aac ctg caa gaa 2113
Ala Met Gln Lys Arg Ser Thr Thr Glu Leu Gly Glu Asn Leu Gln Glu
690 695 700

tta aga gac tta act caa gaa atg gaa gta cat gct gaa aaa ctc aaa 2161
Leu Arg Asp Leu Thr Gln Glu Met Glu Val His Ala Glu Lys Leu Lys
705 710 715

tgg ctg aat aga act gaa ttg gag atg ctt tca gat aaa agt ctg agt 2209
Trp Leu Asn Arg Thr Glu Leu Glu Met Leu Ser Asp Lys Ser Leu Ser
720 725 730

tta cct gaa agg gat aaa att tca gaa agc tta agg act gta aat atg 2257
Leu Pro Glu Arg Asp Lys Ile Ser Glu Ser Leu Arg Thr Val Asn Met
735 740 745

aca tgg aat aag att tgc aga gag gtg cct acc acc ctg aag gaa tgc 2305
Thr Trp Asn Lys Ile Cys Arg Glu Val Pro Thr Thr Leu Lys Glu Cys
750 755 760 765

atc cag gag ccc agt tct gtt tca cag aca agg att gct gct cat cct 2353
Ile Gln Glu Pro Ser Ser Val Ser Gln Thr Arg Ile Ala Ala His Pro
770 775 780

aat gtc caa aag gtg gtg cta gta tca tct gcg tca gat att cct gtt 2401
Asn Val Gln Lys Val Val Leu Val Ser Ser Ala Ser Asp Ile Pro Val
785 790 795

cag tct cat cgt act tcg gaa att tca att cct gct gat ctt gat aaa 2449
Gln Ser His Arg Thr Ser Glu Ile Ser Ile Pro Ala Asp Leu Asp Lys
800 805 810

act ata aca gaa cta gcc gac tgg ctg gta tta atc gac cag atg ctg 2497
Thr Ile Thr Glu Leu Ala Asp Trp Leu Val Leu Ile Asp Gln Met Leu
815 820 825

aag tcc aac att gtc act gtt ggg gat gta gaa gag atc aat aag acc 2545
Lys Ser Asn Ile Val Thr Val Gly Asp Val Glu Glu Ile Asn Lys Thr
830 835 840 845

gtt tcc cga atg aaa att aca aag gct gac tta gaa cag cgc cat cct 2593
Val Ser Arg Met Lys Ile Thr Lys Ala Asp Leu Glu Gln Arg His Pro
850 855 860

cag ctg gat tat gtt ttt aca ttg gca cag aat ttg aaa aat aaa gct 2641
Gln Leu Asp Tyr Val Phe Thr Leu Ala Gln Asn Leu Lys Asn Lys Ala
865 870 875

tcc agt tca gat atg aga aca gca att aca gaa aaa ttg gaa agg gtc 2689
Ser Ser Ser Asp Met Arg Thr Ala Ile Thr Glu Lys Leu Glu Arg Val
880 885 890

aag aac cag tgg gat ggc acc cag cat ggc gtt gag cta aga cag cag 2737
Lys Asn Gln Trp Asp Gly Thr Gln His Gly Val Glu Leu Arg Gln Gln
895 900 905

cag ctt gag gac atg att att gac agt ctt cag tgg gat gac cat agg 2785
Gln Leu Glu Asp Met Ile Ile Asp Ser Leu Gln Trp Asp Asp His Arg
910 915 920 925

gag gag act gaa gaa ctg atg aga aaa tat gag gct cga ctc tat att 2833
Glu Glu Thr Glu Glu Leu Met Arg Lys Tyr Glu Ala Arg Leu Tyr Ile
930 935 940

ctt cag caa gcc cga cgg gat cca ctc acc aaa caa att tct gat aac 2881
Leu Gln Gln Ala Arg Arg Asp Pro Leu Thr Lys Gln Ile Ser Asp Asn
945 950 955

caa ata ctg ctt caa gaa ctg ggt cct gga gat ggt atc gtc atg gcg 2929
Gln Ile Leu Leu Gln Glu Leu Gly Pro Gly Asp Gly Ile Val Met Ala
960 965 970

ttc gat aac gtc ctg cag aaa ctc ctg gag gaa tat ggg agt gat gac 2977
Phe Asp Asn Val Leu Gln Lys Leu Leu Glu Glu Tyr Gly Ser Asp Asp
975 980 985

aca agg aat gtg aaa gaa acc aca gag tac tta aaa aca tca tgg atc 3025
Thr Arg Asn Val Lys Glu Thr Thr Glu Tyr Leu Lys Thr Ser Trp Ile
990 995 1000 1005

aat ctc aaa caa agt att gct gac aga cag aac gcc ttg gag gct gag 3073
Asn Leu Lys Gln Ser Ile Ala Asp Arg Gln Asn Ala Leu Glu Ala Glu
1010 1015 1020
tgg agg acg gtg cag gcc tct cgc aga gat ctg gaa aac ttc ctg aag 3121
Trp Arg Thr Val Gln Ala Ser Arg Arg Asp Leu Glu Asn Phe Leu Lys
1025 1030 1035

tgg atc caa gaa gca gag acc aca gtg aat gtg ctt gtg gat gcc tct 3169
Trp Ile Gln Glu Ala Glu Thr Thr Val Asn Val Leu Val Asp Ala Ser
1040 1045 1050

cat cgg gag aat gct ctt cag gat agt atc ttg gcc agg gaa ctc aaa 3217
His Arg Glu Asn Ala Leu Gln Asp Ser Ile Leu Ala Arg Glu Leu Lys
1055 1060 1065

cag cag atg cag gac atc cag gca gaa att gat gcc cac aat gac ata 3265
Gln Gln Met Gln Asp Ile Gln Ala Glu Ile Asp Ala His Asn Asp Ile
1070 1075 1080 1085

ttt aaa agc att gac gga aac agg cag aag atg gta aaa gct ttg gga 3313
Phe Lys Ser Ile Asp Gly Asn Arg Gln Lys Met Val Lys Ala Leu Gly
1090 1095 1100

aat tct gaa gag gct act atg ctt caa cat cga ctg gat gat atg aac 3361
Asn Ser Glu Glu Ala Thr Met Leu Gln His Arg Leu Asp Asp Met Asn
1105 1110 1115

caa aga tgg aat gac tta aaa gca aaa tct gct agc atc agg gcc cat 3409
Gln Arg Trp Asn Asp Leu Lys Ala Lys Ser Ala Ser Ile Arg Ala His
1120 1125 1130

ttg gag gcc agc gct gag aag tgg aac agg ttg ctg atg tcc tta gaa 3457
Leu Glu Ala Ser Ala Glu Lys Trp Asn Arg Leu Leu Met Ser Leu Glu
1135 1140 1145

gaa ctg atc aaa tgg ctg aat atg aaa gat gaa gag ctt aag aaa caa 3505
Glu Leu Ile Lys Trp Leu Asn Met Lys Asp Glu Glu Leu Lys Lys Gln
1150 1155 1160 1165

atg cct att gga gga gat gtt cca gcc tta cag ctc cag tat gac cat 3553
Met Pro Ile Gly Gly Asp Val Pro Ala Leu Gln Leu Gln Tyr Asp His
1170 1175 1180

tgt aag gcc ctg aga cgg gag tta aag gag aaa gaa tat tct gtc ctg 3601
Cys Lys Ala Leu Arg Arg Glu Leu Lys Glu Lys Glu Tyr Ser Val Leu
1185 1190 1195

aat gct gtc gac cag gcc cga gtt ttc ttg gct gat cag cca att gag 3649
Asn Ala Val Asp Gln Ala Arg Val Phe Leu Ala Asp Gln Pro Ile Glu
1200 1205 1210

gcc cct gaa gag cca aga aga aac cta caa tca aaa aca gaa tta act 3697
Ala Pro Glu Glu Pro Arg Arg Asn Leu Gln Ser Lys Thr Glu Leu Thr
1215 1220 1225

cct gag gag aga gcc caa aag att gcc aaa gcc atg cgc aaa cag tct 3745
Pro Glu Glu Arg Ala Gln Lys Ile Ala Lys Ala Met Arg Lys Gln Ser
1230 1235 1240 1245

tct gaa gtc aaa gaa aaa tgg gaa agt cta aat gct gta act agc aat 3793
Ser Glu Val Lys Glu Lys Trp Glu Ser Leu Asn Ala Val Thr Ser Asn
1250 1255 1260
tgg caa aag caa gtg gac aag gca ttg gag aaa ctc aga gac ctg cag 3841
Trp Gln Lys Gln Val Asp Lys Ala Leu Glu Lys Leu Arg Asp Leu Gln
1265 1270 1275

gga gct atg gat gac ctg gac gct gac atg aag gag gca gag tcc gtg 3889
Gly Ala Met Asp Asp Leu Asp Ala Asp Met Lys Glu Ala Glu Ser Val
1280 1285 1290

cgg aat ggc tgg aag ccc gtg gga gac tta ctc att gac tcg ctg cag 3937
Arg Asn Gly Trp Lys Pro Val Gly Asp Leu Leu Ile Asp Ser Leu Gln
1295 1300 1305

gat cac att gaa aaa atc atg gca ttt aga gaa gaa att gca cca atc 3985
Asp His Ile Glu Lys Ile Met Ala Phe Arg Glu Glu Ile Ala Pro Ile
1310 1315 1320 1325

aac ttt aaa gtt aaa acg gtg aat gat tta tcc agt cag ctg tct cca 4033
Asn Phe Lys Val Lys Thr Val Asn Asp Leu Ser Ser Gln Leu Ser Pro
1330 1335 1340

ctt gac ctg cat ccc tct cta aag atg tct cgc cag cta gat gac ctt 4081
Leu Asp Leu His Pro Ser Leu Lys Met Ser Arg Gln Leu Asp Asp Leu
1345 1350 1355

aat atg cga tgg aaa ctt tta cag gtt tct gtg gat gat cgc ctt aaa 4129
Asn Met Arg Trp Lys Leu Leu Gln Val Ser Val Asp Asp Arg Leu Lys
1360 1365 1370

cag ctt cag gaa gcc cac aga gat ttt gga cca tcc tct cag cat ttt 4177
Gln Leu Gln Glu Ala His Arg Asp Phe Gly Pro Ser Ser Gln His Phe
1375 1380 1385

ctc tct acg tca gtc cag ctg ccg tgg caa aga tcc att tca cat aat 4225
Leu Ser Thr Ser Val Gln Leu Pro Trp Gln Arg Ser Ile Ser His Asn
1390 1395 1400 1405

aaa gtg ccc tat tac atc aac cat caa aca cag acc acc tgt tgg gac 4273
Lys Val Pro Tyr Tyr Ile Asn His Gln Thr Gln Thr Thr Cys Trp Asp
1410 1415 1420

cat cct aaa atg acc gaa ctc ttt caa tcc ctt gct gac ctg aat aat 4321
His Pro Lys Met Thr Glu Leu Phe Gln Ser Leu Ala Asp Leu Asn Asn
1425 1430 1435

gta cgt ttt tct gcc tac cgt aca gca atc aaa atc cga aga cta caa 4369
Val Arg Phe Ser Ala Tyr Arg Thr Ala Ile Lys Ile Arg Arg Leu Gln
1440 1445 1450

aaa gca cta tgt ttg gat ctc tta gag ttg agt aca aca aat gaa att 4417
Lys Ala Leu Cys Leu Asp Leu Leu Glu Leu Ser Thr Thr Asn Glu Ile
1455 1460 1465

ttc aaa cag cac aag ttg aac caa aat gac cag ctc ctc agt gtt cca 4465
Phe Lys Gln His Lys Leu Asn Gln Asn Asp Gln Leu Leu Ser Val Pro
1470 1475 1480 1485

gat gtc atc aac tgt ctg aca aca act tat gat gga ctt gag caa atg 4513
Asp Val Ile Asn Cys Leu Thr Thr Thr Tyr Asp Gly Leu Glu Gln Met
1490 1495 1500

cat aag gac ctg gtc aac gtt cca ctc tgt gtt gat atg tgt ctc aat 4561
His Lys Asp Leu Val Asn Val Pro Leu Cys Val Asp Met Cys Leu Asn
1505 1510 1515

tgg ttg ctc aat gtc tat gac acg ggt cga act gga aaa att aga gtg 4609
Trp Leu Leu Asn Val Tyr Asp Thr Gly Arg Thr Gly Lys Ile Arg Val
1520 1525 1530

cag agt ctg aag att gga tta atg tct ctc tcc aaa ggt ctc ttg gaa 4657
Gln Ser Leu Lys Ile Gly Leu Met Ser Leu Ser Lys Gly Leu Leu Glu
1535 1540 1545

gaa aaa tac aga tat ctc ttt aag gaa gtt gcg ggg ccg aca gaa atg 4705
Glu Lys Tyr Arg Tyr Leu Phe Lys Glu Val Ala Gly Pro Thr Glu Met
1550 1555 1560 1565

tgt gac cag agg cag ctg ggc ctg tta ctt cat gat gcc atc cag atc 4753
Cys Asp Gln Arg Gln Leu Gly Leu Leu Leu His Asp Ala Ile Gln Ile
1570 1575 1580

ccc cgg cag cta ggt gaa gta gca gct ttt gga ggc agt aat att gag 4801
Pro Arg Gln Leu Gly Glu Val Ala Ala Phe Gly Gly Ser Asn Ile Glu
1585 1590 1595

cct agt gtt cgc agc tgc ttc caa cag aat aac aat aaa cca gaa ata 4849
Pro Ser Val Arg Ser Cys Phe Gln Gln Asn Asn Asn Lys Pro Glu Ile
1600 1605 1610

agt gtg aaa gag ttt ata gat tgg atg cat ttg gaa cca cag tcc atg 4897
Ser Val Lys Glu Phe Ile Asp Trp Met His Leu Glu Pro Gln Ser Met
1615 1620 1625

gtt tgg ctc cca gtt tta cat cga gtg gca gca gcg gag act gca aaa 4945
Val Trp Leu Pro Val Leu His Arg Val Ala Ala Ala Glu Thr Ala Lys
1630 1635 1640 1645

cat cag gcc aaa tgc aac atc tgt aaa gaa tgt cca att gtc ggg ttc 4993
His Gln Ala Lys Cys Asn Ile Cys Lys Glu Cys Pro Ile Val Gly Phe
1650 1655 1660

agg tat aga agc ctt aag cat ttt aac tat gat gtc tgc cag agt tgt 5041
Arg Tyr Arg Ser Leu Lys His Phe Asn Tyr Asp Val Cys Gln Ser Cys
1665 1670 1675

ttc ttt tcg ggt cga aca gca aaa ggt cac aaa tta cat tac cca atg 5089
Phe Phe Ser Gly Arg Thr Ala Lys Gly His Lys Leu His Tyr Pro Met
1680 1685 1690

gtg gaa tat tgt ata cct aca aca tct ggg gaa gat gta cga gac ttc 5137
Val Glu Tyr Cys Ile Pro Thr Thr Ser Gly Glu Asp Val Arg Asp Phe
1695 1700 1705

aca aag gta ctt aag aac aag ttc agg tcg aag aag tac ttt gcc aaa 5185
Thr Lys Val Leu Lys Asn Lys Phe Arg Ser Lys Lys Tyr Phe Ala Lys
1710 1715 1720 1725

cac cct cga ctt ggt tac ctg cct gtc cag aca gtt ctt gaa ggt gac 5233
His Pro Arg Leu Gly Tyr Leu Pro Val Gln Thr Val Leu Glu Gly Asp
1730 1735 1740

aac tta gag act cct atc aca ctc atc agt atg tgg cca gag cac tat 5281
Asn Leu Glu Thr Pro Ile Thr Leu Ile Ser Met Trp Pro Glu His Tyr
1745 1750 1755

gac ccc tea caa tct cct caa ctg ttt cat gat gac acc cat tea aga 5329 
Asp Pro Ser Gin Ser Pro Gin Leu Phe His Asp Asp Thr His Ser Arg 
1760 1765 1770 

ata gaa caa tat gcc aca cga ctg gcc cag atg gaa agg act aat ggg 5377 
lie Glu Gin Tyr Ala Thr Arg Leu Ala Gin Met Glu Arg Thr Asn Gly 
1775 1780 1785 

tct ttt etc act gat age age tec ace aca gga agt gtg gaa gae gag 5425 
Ser Phe Leu Thr Asp Ser Ser Ser Thr Thr Gly Ser Val Glu Asp Glu 
1790 1795 1800 1805 

cac gcc etc ate cag cag tat tgc caa aca etc gga gga gag tec cea 5473 
His Ala Leu lie Gin Gin Tyr Cys Gin Thr Leu Gly Gly Glu Ser Pro 
1810 1815 1820 

gtg age cag ccg cag age eca get cag ate ctg aag tea gta gag agg 5521 
Val Ser Gin Pro Gin Ser Pro Ala Gin lie Leu Lys Ser Val Glu Arg 
1825 1830 1835 



gaa gaa cgt gga gaa ctg gag agg atc att gct gac ctg gag gaa gaa 5569
Glu Glu Arg Gly Glu Leu Glu Arg Ile Ile Ala Asp Leu Glu Glu Glu
1840 1845 1850

caa aga aat cta cag gtg gag tat gag cag ctg aag gac cag cac ctc 5617
Gln Arg Asn Leu Gln Val Glu Tyr Glu Gln Leu Lys Asp Gln His Leu
1855 1860 1865

cga agg ggg ctc cct gtc ggt tca ccg cca gag tcg att ata tct ccc 5665
Arg Arg Gly Leu Pro Val Gly Ser Pro Pro Glu Ser Ile Ile Ser Pro
1870 1875 1880 1885

cat cac acg tct gag gat tca gaa ctt ata gca gaa gca aaa ctc ctc 5713
His His Thr Ser Glu Asp Ser Glu Leu Ile Ala Glu Ala Lys Leu Leu
1890 1895 1900

agg cag cac aaa ggt cgg ctg gag gct agg atg cag att tta gaa gat 5761
Arg Gln His Lys Gly Arg Leu Glu Ala Arg Met Gln Ile Leu Glu Asp
1905 1910 1915

cac aat aaa cag ctg gag tct cag ctc cac cgc ctc cga cag ctg ctg 5809
His Asn Lys Gln Leu Glu Ser Gln Leu His Arg Leu Arg Gln Leu Leu
1920 1925 1930

gag cag cct gaa tct gat tcc cga atc aat ggt gtt tcc cca tgg gct 5857
Glu Gln Pro Glu Ser Asp Ser Arg Ile Asn Gly Val Ser Pro Trp Ala
1935 1940 1945

tct cct cag cat tct gca ctg agc tac tcg ctt gat cca gat gcc tcc 5905
Ser Pro Gln His Ser Ala Leu Ser Tyr Ser Leu Asp Pro Asp Ala Ser
1950 1955 1960 1965

ggc cca cag ttc cac cag gca gcg gga gag gac ctg ctg gcc cca ccg 5953
Gly Pro Gln Phe His Gln Ala Ala Gly Glu Asp Leu Leu Ala Pro Pro
1970 1975 1980

cac gac acc agc acg gat ctc acg gag gtc atg gag cag att cac agc 6001
His Asp Thr Ser Thr Asp Leu Thr Glu Val Met Glu Gln Ile His Ser
1985 1990 1995

acg ttt cca tct tgc tgc cca aat gtt ccc agc agg cca cag gca atg
Thr Phe Pro Ser Cys Cys Pro Asn Val Pro Ser Arg Pro Gln Ala Met
2000 2005 2010



taa tcactag 



<210> 9 
<211> 2013 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Predicted amino acid 
sequence of a utrophin B isoform "minigene" 



<400> 9 

Met Ser Gly Leu Ala Ala Thr Thr Phe His Trp Lys Lys Cys Arg Leu
1 5 10 15

Asp Leu Pro Gly His Val Ala Leu Gln Ala Cys Lys Arg Leu Pro Asp
20 25 30

Glu His Asn Asp Val Gln Lys Lys Thr Phe Thr Lys Trp Ile Asn Ala
35 40 45

Arg Phe Ser Lys Ser Gly Lys Pro Pro Ile Ser Asp Met Phe Ser Asp
50 55 60

Leu Lys Asp Gly Arg Lys Leu Leu Asp Leu Leu Glu Gly Leu Thr Gly
65 70 75 80

Thr Ser Leu Pro Lys Glu Arg Gly Ser Thr Arg Val His Ala Leu Asn
85 90 95

Asn Val Asn Arg Val Leu Gln Val Leu His Gln Asn Asn Val Asp Leu
100 105 110

Val Asn Ile Gly Gly Thr Asp Ile Val Ala Gly Asn Pro Lys Leu Thr
115 120 125

Leu Gly Leu Leu Trp Ser Ile Ile Leu His Trp Gln Val Lys Asp Val
130 135 140

Met Lys Asp Ile Met Ser Asp Leu Gln Gln Thr Asn Ser Glu Lys Ile
145 150 155 160

Leu Leu Ser Trp Val Arg Gln Thr Thr Arg Pro Tyr Ser Gln Val Asn
165 170 175

Val Leu Asn Phe Thr Thr Ser Trp Thr Asp Gly Leu Ala Phe Asn Ala
180 185 190

Val Leu His Arg His Lys Pro Asp Leu Phe Asp Trp Asp Glu Met Val
195 200 205

Lys Met Ser Pro Ile Glu Arg Leu Asp His Ala Phe Asp Lys Ala His
210 215 220

Thr Ser Leu Gly Ile Glu Lys Leu Leu Ser Pro Glu Thr Val Ala Val
225 230 235 240

His Leu Pro Asp Lys Lys Ser Ile Ile Met Tyr Leu Thr Ser Leu Phe
245 250 255

Glu Val Leu Pro Gln Gln Val Thr Ile Asp Ala Ile Arg Glu Val Glu
260 265 270

Thr Leu Pro Arg Lys Tyr Lys Lys Glu Cys Glu Glu Glu Glu Ile His
275 280 285

Ile Gln Ser Ala Val Leu Ala Glu Glu Gly Gln Ser Pro Arg Ala Glu
290 295 300

Thr Pro Ser Thr Val Thr Glu Val Asp Met Asp Leu Asp Ser Tyr Gln
305 310 315 320

Ile Ala Leu Glu Glu Val Leu Thr Trp Leu Leu Ser Ala Glu Asp Thr
325 330 335

Phe Gln Glu Gln His Asp Ile Ser Asp Asp Val Glu Glu Val Lys Glu
340 345 350

Gln Phe Ala Thr His Glu Thr Phe Met Met Glu Leu Thr Ala His Gln
355 360 365

Ser Ser Val Gly Ser Val Leu Gln Ala Gly Asn Gln Leu Met Thr Gln
370 375 380

Gly Thr Leu Ser Arg Glu Glu Glu Phe Glu Ile Gln Glu Gln Met Thr
385 390 395 400

Leu Leu Asn Ala Arg Trp Glu Ala Leu Arg Val Glu Ser Met Glu Arg
405 410 415

Gln Ser Arg Leu His Asp Ala Leu Met Glu Leu Gln Lys Lys Gln Leu
420 425 430

Gln Gln Leu Ser Ser Trp Leu Ala Leu Thr Glu Glu Arg Ile Gln Lys
435 440 445

Met Glu Ser Leu Pro Leu Gly Asp Asp Leu Pro Ser Leu Gln Lys Leu
450 455 460

Leu Gln Glu His Lys Ser Leu Gln Asn Asp Leu Glu Ala Glu Gln Val
465 470 475 480

Lys Val Asn Ser Leu Thr His Met Val Val Ile Val Asp Glu Asn Ser
485 490 495

Gly Glu Ser Ala Thr Ala Leu Leu Glu Asp Gln Leu Gln Lys Leu Gly
500 505 510

Glu Arg Trp Thr Ala Val Cys Arg Trp Thr Glu Glu Arg Trp Asn Arg
515 520 525

Leu Gln Glu Ile Ser Ile Leu Trp Gln Glu Leu Leu Glu Glu Gln Cys
530 535 540

Leu Leu Glu Ala Trp Leu Thr Glu Lys Glu Glu Ala Leu Asp Lys Val
545 550 555 560

Gln Thr Ser Asn Phe Lys Asp Gln Lys Glu Leu Ser Val Ser Val Arg
565 570 575

Arg Leu Ala Ile Leu Lys Glu Asp Met Glu Met Lys Arg Gln Thr Leu
580 585 590

Asp Gln Leu Ser Glu Ile Gly Gln Asp Val Gly Gln Leu Leu Ser Asn
595 600 605

Pro Lys Ala Ser Lys Lys Met Asn Ser Asp Ser Glu Glu Leu Thr Gln
610 615 620

Arg Trp Asp Ser Leu Val Gln Arg Leu Glu Asp Ser Ser Asn Gln Val
625 630 635 640

Thr Gln Ala Val Ala Lys Leu Gly Met Ser Gln Ile Pro Gln Lys Asp
645 650 655

Leu Leu Glu Thr Val His Val Arg Glu Gln Gly Met Val Lys Lys Pro
660 665 670

Lys Gln Glu Leu Pro Pro Pro Leu Thr Lys Ala Glu His Ala Met Gln
675 680 685

Lys Arg Ser Thr Thr Glu Leu Gly Glu Asn Leu Gln Glu Leu Arg Asp
690 695 700

Leu Thr Gln Glu Met Glu Val His Ala Glu Lys Leu Lys Trp Leu Asn
705 710 715 720

Arg Thr Glu Leu Glu Met Leu Ser Asp Lys Ser Leu Ser Leu Pro Glu
725 730 735

Arg Asp Lys Ile Ser Glu Ser Leu Arg Thr Val Asn Met Thr Trp Asn
740 745 750

Lys Ile Cys Arg Glu Val Pro Thr Thr Leu Lys Glu Cys Ile Gln Glu
755 760 765

Pro Ser Ser Val Ser Gln Thr Arg Ile Ala Ala His Pro Asn Val Gln
770 775 780

Lys Val Val Leu Val Ser Ser Ala Ser Asp Ile Pro Val Gln Ser His
785 790 795 800

Arg Thr Ser Glu Ile Ser Ile Pro Ala Asp Leu Asp Lys Thr Ile Thr
805 810 815

Glu Leu Ala Asp Trp Leu Val Leu Ile Asp Gln Met Leu Lys Ser Asn
820 825 830



Ile Val Thr Val Gly Asp Val Glu Glu Ile Asn Lys Thr Val Ser Arg
835 840 845

Met Lys Ile Thr Lys Ala Asp Leu Glu Gln Arg His Pro Gln Leu Asp
850 855 860

Tyr Val Phe Thr Leu Ala Gln Asn Leu Lys Asn Lys Ala Ser Ser Ser
865 870 875 880

Asp Met Arg Thr Ala Ile Thr Glu Lys Leu Glu Arg Val Lys Asn Gln
885 890 895

Trp Asp Gly Thr Gln His Gly Val Glu Leu Arg Gln Gln Gln Leu Glu
900 905 910

Asp Met Ile Ile Asp Ser Leu Gln Trp Asp Asp His Arg Glu Glu Thr
915 920 925

Glu Glu Leu Met Arg Lys Tyr Glu Ala Arg Leu Tyr Ile Leu Gln Gln
930 935 940

Ala Arg Arg Asp Pro Leu Thr Lys Gln Ile Ser Asp Asn Gln Ile Leu
945 950 955 960

Leu Gln Glu Leu Gly Pro Gly Asp Gly Ile Val Met Ala Phe Asp Asn
965 970 975

Val Leu Gln Lys Leu Leu Glu Glu Tyr Gly Ser Asp Asp Thr Arg Asn
980 985 990

Val Lys Glu Thr Thr Glu Tyr Leu Lys Thr Ser Trp Ile Asn Leu Lys
995 1000 1005

Gln Ser Ile Ala Asp Arg Gln Asn Ala Leu Glu Ala Glu Trp Arg Thr
1010 1015 1020

Val Gln Ala Ser Arg Arg Asp Leu Glu Asn Phe Leu Lys Trp Ile Gln
1025 1030 1035 1040

Glu Ala Glu Thr Thr Val Asn Val Leu Val Asp Ala Ser His Arg Glu
1045 1050 1055

Asn Ala Leu Gln Asp Ser Ile Leu Ala Arg Glu Leu Lys Gln Gln Met
1060 1065 1070

Gln Asp Ile Gln Ala Glu Ile Asp Ala His Asn Asp Ile Phe Lys Ser
1075 1080 1085

Ile Asp Gly Asn Arg Gln Lys Met Val Lys Ala Leu Gly Asn Ser Glu
1090 1095 1100

Glu Ala Thr Met Leu Gln His Arg Leu Asp Asp Met Asn Gln Arg Trp
1105 1110 1115 1120

Asn Asp Leu Lys Ala Lys Ser Ala Ser Ile Arg Ala His Leu Glu Ala
1125 1130 1135

Ser Ala Glu Lys Trp Asn Arg Leu Leu Met Ser Leu Glu Glu Leu Ile
1140 1145 1150

Lys Trp Leu Asn Met Lys Asp Glu Glu Leu Lys Lys Gln Met Pro Ile
1155 1160 1165

Gly Gly Asp Val Pro Ala Leu Gln Leu Gln Tyr Asp His Cys Lys Ala
1170 1175 1180

Leu Arg Arg Glu Leu Lys Glu Lys Glu Tyr Ser Val Leu Asn Ala Val
1185 1190 1195 1200

Asp Gln Ala Arg Val Phe Leu Ala Asp Gln Pro Ile Glu Ala Pro Glu
1205 1210 1215

Glu Pro Arg Arg Asn Leu Gln Ser Lys Thr Glu Leu Thr Pro Glu Glu
1220 1225 1230

Arg Ala Gln Lys Ile Ala Lys Ala Met Arg Lys Gln Ser Ser Glu Val
1235 1240 1245

Lys Glu Lys Trp Glu Ser Leu Asn Ala Val Thr Ser Asn Trp Gln Lys
1250 1255 1260

Gln Val Asp Lys Ala Leu Glu Lys Leu Arg Asp Leu Gln Gly Ala Met
1265 1270 1275 1280

Asp Asp Leu Asp Ala Asp Met Lys Glu Ala Glu Ser Val Arg Asn Gly
1285 1290 1295

Trp Lys Pro Val Gly Asp Leu Leu Ile Asp Ser Leu Gln Asp His Ile
1300 1305 1310

Glu Lys Ile Met Ala Phe Arg Glu Glu Ile Ala Pro Ile Asn Phe Lys
1315 1320 1325

Val Lys Thr Val Asn Asp Leu Ser Ser Gln Leu Ser Pro Leu Asp Leu
1330 1335 1340

His Pro Ser Leu Lys Met Ser Arg Gln Leu Asp Asp Leu Asn Met Arg
1345 1350 1355 1360

Trp Lys Leu Leu Gln Val Ser Val Asp Asp Arg Leu Lys Gln Leu Gln
1365 1370 1375

Glu Ala His Arg Asp Phe Gly Pro Ser Ser Gln His Phe Leu Ser Thr
1380 1385 1390

Ser Val Gln Leu Pro Trp Gln Arg Ser Ile Ser His Asn Lys Val Pro
1395 1400 1405

Tyr Tyr Ile Asn His Gln Thr Gln Thr Thr Cys Trp Asp His Pro Lys
1410 1415 1420

Met Thr Glu Leu Phe Gln Ser Leu Ala Asp Leu Asn Asn Val Arg Phe
1425 1430 1435 1440

Ser Ala Tyr Arg Thr Ala Ile Lys Ile Arg Arg Leu Gln Lys Ala Leu
1445 1450 1455

Cys Leu Asp Leu Leu Glu Leu Ser Thr Thr Asn Glu Ile Phe Lys Gln
1460 1465 1470

His Lys Leu Asn Gln Asn Asp Gln Leu Leu Ser Val Pro Asp Val Ile
1475 1480 1485

Asn Cys Leu Thr Thr Thr Tyr Asp Gly Leu Glu Gln Met His Lys Asp
1490 1495 1500

Leu Val Asn Val Pro Leu Cys Val Asp Met Cys Leu Asn Trp Leu Leu
1505 1510 1515 1520

Asn Val Tyr Asp Thr Gly Arg Thr Gly Lys Ile Arg Val Gln Ser Leu
1525 1530 1535

Lys Ile Gly Leu Met Ser Leu Ser Lys Gly Leu Leu Glu Glu Lys Tyr
1540 1545 1550

Arg Tyr Leu Phe Lys Glu Val Ala Gly Pro Thr Glu Met Cys Asp Gln
1555 1560 1565

Arg Gln Leu Gly Leu Leu Leu His Asp Ala Ile Gln Ile Pro Arg Gln
1570 1575 1580

Leu Gly Glu Val Ala Ala Phe Gly Gly Ser Asn Ile Glu Pro Ser Val
1585 1590 1595 1600

Arg Ser Cys Phe Gln Gln Asn Asn Asn Lys Pro Glu Ile Ser Val Lys
1605 1610 1615

Glu Phe Ile Asp Trp Met His Leu Glu Pro Gln Ser Met Val Trp Leu
1620 1625 1630

Pro Val Leu His Arg Val Ala Ala Ala Glu Thr Ala Lys His Gln Ala
1635 1640 1645

Lys Cys Asn Ile Cys Lys Glu Cys Pro Ile Val Gly Phe Arg Tyr Arg
1650 1655 1660

Ser Leu Lys His Phe Asn Tyr Asp Val Cys Gln Ser Cys Phe Phe Ser
1665 1670 1675 1680

Gly Arg Thr Ala Lys Gly His Lys Leu His Tyr Pro Met Val Glu Tyr
1685 1690 1695

Cys Ile Pro Thr Thr Ser Gly Glu Asp Val Arg Asp Phe Thr Lys Val
1700 1705 1710

Leu Lys Asn Lys Phe Arg Ser Lys Lys Tyr Phe Ala Lys His Pro Arg
1715 1720 1725

Leu Gly Tyr Leu Pro Val Gln Thr Val Leu Glu Gly Asp Asn Leu Glu
1730 1735 1740

Thr Pro Ile Thr Leu Ile Ser Met Trp Pro Glu His Tyr Asp Pro Ser
1745 1750 1755 1760

Gln Ser Pro Gln Leu Phe His Asp Asp Thr His Ser Arg Ile Glu Gln
1765 1770 1775

Tyr Ala Thr Arg Leu Ala Gln Met Glu Arg Thr Asn Gly Ser Phe Leu
1780 1785 1790

Thr Asp Ser Ser Ser Thr Thr Gly Ser Val Glu Asp Glu His Ala Leu
1795 1800 1805

Ile Gln Gln Tyr Cys Gln Thr Leu Gly Gly Glu Ser Pro Val Ser Gln
1810 1815 1820

Pro Gln Ser Pro Ala Gln Ile Leu Lys Ser Val Glu Arg Glu Glu Arg
1825 1830 1835 1840

Gly Glu Leu Glu Arg Ile Ile Ala Asp Leu Glu Glu Glu Gln Arg Asn
1845 1850 1855

Leu Gln Val Glu Tyr Glu Gln Leu Lys Asp Gln His Leu Arg Arg Gly
1860 1865 1870

Leu Pro Val Gly Ser Pro Pro Glu Ser Ile Ile Ser Pro His His Thr
1875 1880 1885

Ser Glu Asp Ser Glu Leu Ile Ala Glu Ala Lys Leu Leu Arg Gln His
1890 1895 1900

Lys Gly Arg Leu Glu Ala Arg Met Gln Ile Leu Glu Asp His Asn Lys
1905 1910 1915 1920

Gln Leu Glu Ser Gln Leu His Arg Leu Arg Gln Leu Leu Glu Gln Pro
1925 1930 1935

Glu Ser Asp Ser Arg Ile Asn Gly Val Ser Pro Trp Ala Ser Pro Gln
1940 1945 1950

His Ser Ala Leu Ser Tyr Ser Leu Asp Pro Asp Ala Ser Gly Pro Gln
1955 1960 1965

Phe His Gln Ala Ala Gly Glu Asp Leu Leu Ala Pro Pro His Asp Thr
1970 1975 1980

Ser Thr Asp Leu Thr Glu Val Met Glu Gln Ile His Ser Thr Phe Pro
1985 1990 1995 2000

Ser Cys Cys Pro Asn Val Pro Ser Arg Pro Gln Ala Met
2005 2010



<210> 10 

<211> 25 

<212> DNA 

<213> Homo sapiens 

<400> 10 

acaggacatc ccagtgtgca gttcg 25



<210> 11 
<211> 22 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: Sequence 
obtainable from non-human mammal 

<400> 11 

gattgtggat gaaaacagtg gg 22 



<210> 12 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 
<400> 12

gatgttcctg tgaggccttc gag 23

<210> 13
<211> 21 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 



<400> 13 

cactcttgga aaatcgagcg t 21 



<210> 14 
<211> 22 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 14 

actatgatgt ctgccagagt tg 22



<210> 15 

<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 15 

gatccaatag cttccttcca tcttt 25



<210> 16 
<211> 20 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 16 

tggaaaaagt ggaggttgga 20



<210> 17 

<211> 20 

<212> DNA 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 17 

tccaacctcc actttttcca 20 



<210> 18 
<211> 21 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 18 

gcctggagag ctacatgccc t 21 



<210> 19 
<211> 25 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 19 

ctccacatct ttttcctcat catct 25 



<210> 20 
<211> 22 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 20 

gattgtggtg atggttgtag aa 22 



<210> 21 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 21

gatgatgagg aaaaagatgt ggag 24 



<210> 22
<211> 24
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 22 

aaacccaaaa taacacagga catc 24 



<210> 23 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 23 

agtgtaactt ctctctggtg 20 



<210> 24 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 24 

taagcagatg taggtgatga gc 22 



<210> 25 
<211> 21 
<212> DNA 

<213> Artificial Sequence 

<220>

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 25 

gctgcttttg ttgtccactt c 21 



<210> 26 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
Oligonucleotide 

<400> 26

atagcttcct tccatctttg ag 22



<210> 27 
<211> 23 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 



<400> 27 

gtccacgttc ttccctctct act 23



<210> 28 
<211> 28 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 28 

gcgtgcagtg gaccattttt cagattta 28



<210> 29 
<211> 27 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 29 

cgctgcagca gccaccacat ttcgttg 27



<210> 30 

<211> 28 

<212> DNA 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide 

<400> 30 

gcgtgcagat cgagcgttta tccatttg 28






